175 dependents
Package Description Downloads/month
TurboQuant KV cache compression plugin for vLLM — asymmetric K/V, 8 models valid... 8K
[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI 6K
Text2Text Language Modeling Toolkit 6K
RankLLM is a Python toolkit for reproducible information retrieval research usin... 5K
[NeurIPS'24] HippoRAG is a novel RAG framework inspired by human long-term memor... 5K
The official zero-trust, high-throughput kinetic execution engine for the coreas... 4K
vLLM plugin for RBLN NPU 4K
vLLM plugin for Qwerky AI MambaInLlama hybrid models 4K
Serve MESA models locally 3K
A Python package for simulating hospital administrative tasks. 3K
vLLM plugin for Spyre hardware support 3K
Towards Human-Sounding Speech 3K
factory SDK 3K
SEC filings and Earnings call transcripts data 3K
Crilla is a simple way to introduce optimized single-GPU training into your proj... 3K
2K
The easiest way to serve AI apps and models - Build Model Inference APIs, Job qu... 2K
2K
Roboreason package 2K
Benchmarking the guided infilling models. 2K
A framework for optimizing DSPy programs with RL 2K
Agent-as-Annotators: Structured Distillation of Web Agent Capabilities 1K
Qwen3 ASR model for fasr 1K
vLLM adapter for a TGIS-compatible grpc server 1K
vLLM plugin for Spyre hardware support 1K
1K
llama-index embeddings vllm integration 1K
This is a faster implementation for TTS models, to be used in highly async envir... 1K
An official repository for PatientSim package. 1K
A unified inference engine for large language models (LLMs) including open-sourc... 1K
An infinitely scalable text world for evaluating long-term memory in LLM agents 983
Official implementation of TopicGPT: A Prompt-based Topic Modeling Framework (NA... 947
Graph Foundation Model for Retrieval Augmented Generation 945
915
An easy-to-extend LLM annotator for robust, resumable data annotation. 901
happy_vllm is a REST API for vLLM, production ready 894
Automatic configuration planner for vLLM - Eliminate the guesswork of configurin... 891
A library that helps to chat with all kinds of LLMs consistently. 884
Converting longitudinal patient data into text for LLM-based event prediction an... 869
Hackable RL post-training for LLMs 808
🌾 OAT: A research-friendly framework for LLM online alignment, including reinfor... 788
LLAMP - Large Language Model for Planning 736
vLLM plugin for interacting with activations during inference 715
An asynchronous chat engine using vLLM with an async producer-consumer pattern. 708
A Unified, Customizable, and High-Performing Open-Source Toolkit for Prompt Opti... 705
ThinkBooster: a unified framework for test-time compute scaling of LLM reasoning 683
A package for creating ML research assistant models through paper dataset creati... 673
Efficient deep learning models for the edge. 615
vLLM plugin: out-of-tree registration of canon-layer architectures (e.g. LlamaCa... 576
568