Dependents of vllm - PyPI Stats

175 dependents

Package	Description	Downloads/month
turboquant-vllm	TurboQuant KV cache compression plugin for vLLM — asymmetric K/V, 8 models valid...	8K
bigcodebench	[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI	6K
text2text	Text2Text Language Modeling Toolkit	6K
rank-llm	RankLLM is a Python toolkit for reproducible information retrieval research usin...	5K
hipporag	[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memor...	5K
coreason-runtime	The official zero-trust, high-throughput kinetic execution engine for the coreas...	4K
vllm-rbln	vLLM plugin for RBLN NPU	4K
qwerky-vllm-models	vLLM plugin for Qwerky AI MambaInLlama hybrid models	4K
londonaicentre-mesa-local	Serve MESA models locally	3K
h-adminsim	A Python package for simulating hospital administrative tasks.	3K
vllm-spyre	vLLM plugin for Spyre hardware support	3K
orpheus-speech	Towards Human-Sounding Speech	3K
factory-sdk	factory SDK	3K
finance-data-llm	SEC filings and Earnings call transcripts data	3K
cirilla	Cirilla is a simple way to introduce optimized single-GPU training into your proj...	3K
llm-optimized-inference		2K
bentoml-unsloth	The easiest way to serve AI apps and models - Build Model Inference APIs, Job qu...	2K
dataflow-421		2K
roboreason	Roboreason package	2K
gimbench	Benchmarking the guided infilling models.	2K
arbor-ai	A framework for optimizing DSPy programs with RL	2K
agent-as-annotators	Agent-as-Annotators: Structured Distillation of Web Agent Capabilities	1K
fasr-asr-qwen3	Qwen3 ASR model for fasr	1K
vllm-tgis-adapter	vLLM adapter for a TGIS-compatible grpc server	1K
sendnn-inference	vLLM plugin for Spyre hardware support	1K
galadriel-node		1K
llama-index-embeddings-vllm	llama-index embeddings vllm integration	1K
auralis	This is a faster implementation for TTS models, to be used in highly async envir...	1K
patientsim	An official repository for PatientSim package.	1K
llm-engines	A unified inference engine for large language models (LLMs) including open-sourc...	1K
agent-odyssey	An infinitely scalable text world for evaluating long-term memory in LLM agents	983
topicgpt-python	Official implementation of TopicGPT: A Prompt-based Topic Modeling Framework (NA...	947
gfmrag	Graph Foundation Model for Retrieval Augmented Generation	945
flute-kernel		915
llm-annotator	An easy-to-extend LLM annotator for robust, resumable data annotation.	901
happy-vllm	happy_vllm is a REST API for vLLM, production ready	894
vllm-speculative-autoconfig	Automatic configuration planner for vLLM - Eliminate the guesswork of configurin...	891
llmlite	A library that helps you chat with all kinds of LLMs consistently.	884
twinweaver	Converting longitudinal patient data into text for LLM-based event prediction an...	869
invokerl	Hackable RL post-training for LLMs	808
oat-llm	🌾 OAT: A research-friendly framework for LLM online alignment, including reinfor...	788
llamp	LLAMP - Large Language Model for Planning	736
vllm-lens	vLLM plugin for interacting with activations during inference	715
async-chat-engine	An asynchronous chat engine using vLLM with an async producer-consumer pattern.	708
greaterprompt	A Unified, Customizable, and High-Performing Open-Source Toolkit for Prompt Opti...	705
thinkbooster	ThinkBooster: a unified framework for test-time compute scaling of LLM reasoning	683
papertuner	A package for creating ML research assistant models through paper dataset creati...	673
embedl-models	Efficient deep learning models for the edge.	615
vllm-canon	vLLM plugin: out-of-tree registration of canon-layer architectures (e.g. LlamaCa...	576
protollm-api		568