PyPI Stats

Search Packages

Search Python packages by name, description, or GitHub topic, or filter them by metrics.
Each result shows the package's PyPI downloads, followed by the GitHub stars and forks of its upstream repository.

  • vllm-project / vllm: A high-throughput and memory-efficient inference and serving engine for LLMs (9.4M downloads · 79K stars · 16K forks)
  • basetenlabs / truss: The simplest way to serve AI/ML models in production (632K downloads · 1K stars · 102 forks)
  • vllm-project / vllm-omni: A framework for efficient model inference with omni-modality models (477K downloads · 5K stars · 867 forks)
  • basetenlabs / truss-transfer: The simplest way to serve AI/ML models in production (300K downloads · 1K stars · 102 forks)
  • bentoml / bentoml: The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more! (198K downloads · 9K stars · 959 forks)
  • basetenlabs / baseten-performance-client: The simplest way to serve AI/ML models in production (182K downloads · 1K stars · 102 forks)
  • vllm-project / vllm-tpu: A high-throughput and memory-efficient inference and serving engine for LLMs (143K downloads · 79K stars · 16K forks)
  • kserve / kserve: Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes (114K downloads · 5K stars · 1K forks)
  • mlrun / mlrun: MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications. (53K downloads · 2K stars · 303 forks)
  • tensorchord / envd: 🏕️ Reproducible development environment for humans and agents (39K downloads · 2K stars · 167 forks)
  • mlrun / mlrun-pipelines-kfp-common: MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications. (36K downloads · 2K stars · 303 forks)
  • mlrun / mlrun-pipelines-kfp-v1-8: MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications. (36K downloads · 2K stars · 303 forks)
  • mosecorg / mosec: A high-performance ML model serving framework, offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine (18K downloads · 899 stars · 72 forks)
  • clearml / clearml-serving: ClearML - Model-Serving Orchestration and Repository Solution (11K downloads · 164 stars · 50 forks)
  • NimbleBoxAI / nbox: The official python package for NimbleBox. Exposes all APIs as CLIs and contains modules to make ML 🌸 (9K downloads · 87 stars · 13 forks)
  • openvinotoolkit / ovmsclient: A scalable inference server for models optimized with OpenVINO™ (9K downloads · 870 stars · 251 forks)
  • predibase / lorax-client: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs (8K downloads · 4K stars · 312 forks)
  • vllm-project / vllm-ascend: Community maintained hardware plugin for vLLM on Ascend (7K downloads · 2K stars · 1K forks)
  • google / google-jetstream: JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome). (4K downloads · 432 stars · 64 forks)
  • notAI-tech / fastdeploy: Deploy DL/ML inference pipelines with minimal extra code. (4K downloads · 103 stars · 17 forks)
  • logicalclocks / hsml: Hopsworks Machine Learning Api 🚀 Model management with a model registry and model serving (3K downloads · 8 stars · 20 forks)
  • FedML-AI / fedml: FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale. (3K downloads · 4K stars · 766 forks)
  • aniketmaurya / chitra: A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment. (3K downloads · 234 stars · 37 forks)
  • kubeflow / kfserving: Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes (3K downloads · 5K stars · 1K forks)
Data from PyPI, GitHub, ClickHouse, and BigQuery.