55 dependents
Description Downloads/month
SGLang is a high-performance serving framework for large language models and mul... 287.7M
A high-throughput and memory-efficient inference and serving engine for LLMs 9.4M
A high-throughput and memory-efficient inference and serving engine for LLMs 143K
Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... 31K
(no description) 24K
Go ahead and axolotl questions 20K
TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... 16K
Mistral Voxtral STT/TTS adapter for Vox 15K
The robust European language model benchmark. 13K
Large-scale LLM inference engine 7K
The robust European language model benchmark. 5K
Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... 3K
Universal Deep Learning Inference Engine — execute any AI model without model-sp... 3K
🚀 PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Fa... 3K
vLLM CPU inference engine (AVX512 + VNNI optimized) 3K
Crilla is a simple way to introduce optimized single-GPU training into your proj... 3K
Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... 3K
Push-to-talk transcription 2K
vLLM CPU inference engine (AVX512 optimized) 2K
OntoLearner: A Modular Python Library for Ontology Learning with LLMs https://py... 2K
A CLI client meant to provide the core features of the ChatGPT and Le Chat weba... 2K
ToolAgents is a lightweight and flexible framework for creating function-calling... 2K
Multi-Agent framework 2K
An easy-to-extend LLM annotator for robust, resumable data annotation. 901
AI Vulnerability Identification & Security Evaluation framework 881
General Information, model certifications, and benchmarks for nm-vllm enterprise... 666
A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Mo... 640
Mistral Voxtral plugin for the cjm-transcription-plugin-system library - provide... 602
Mistral Voxtral plugin for the cjm-transcription-plugin-system library - provide... 546
vLLM Kunlun3 backend plugin 464
(no description) 455
Voxtral audio processing and model implementation for Apple Silicon using MLX 438
A high-throughput and memory-efficient inference and serving engine for LLMs 437
PDF processing pipeline: remove headers/footers, convert to markdown, and genera... 410
Hexamind library to implement RAG solutions 402
A high-throughput and memory-efficient inference and serving engine for LLMs 375
A high-throughput and memory-efficient inference and serving engine for LLMs 344
A smart CLI friend 313
A simple CLI to transcribe YouTube videos or local audio/video files and produce... 264
Sparse AutoEncoder to decode Mistral LLM 262
Voxtral Mini Realtime speech-to-text in MLX 256
Add your description here 198
Slimmed release mirror of UniTrust for AEN and TruthPrInt. 192
A high-throughput and memory-efficient inference and serving engine for LLMs 176
Automatically generate commit messages from changes 172
A web scraping library based on LangChain which uses LLM and direct graph logic ... 150
A high-throughput and memory-efficient inference and serving engine for LLMs 132
A high-throughput and memory-efficient inference and serving engine for LLMs 115
Add your description here 94
use any llm api in a plug-and-play fashion 92