55 dependents
| Description | Downloads/month |
|---|---|
| SGLang is a high-performance serving framework for large language models and mul... | 287.7M |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 9.4M |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 143K |
| Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 31K |
| | 24K |
| Go ahead and axolotl questions | 20K |
| TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... | 16K |
| Mistral Voxtral STT/TTS adapter for Vox | 15K |
| The robust European language model benchmark. | 13K |
| Large-scale LLM inference engine | 7K |
| The robust European language model benchmark. | 5K |
| Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 3K |
| Universal Deep Learning Inference Engine — execute any AI model without model-sp... | 3K |
| 🚀 Pytorch Distributed native training library for LLMs/VLMs with OOTB Hugging Fa... | 3K |
| vLLM CPU inference engine (AVX512 + VNNI optimized) | 3K |
| Crilla is a simple way to introduce optimized single-GPU training into your proj... | 3K |
| Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 3K |
| Push-to-talk transcription | 2K |
| vLLM CPU inference engine (AVX512 optimized) | 2K |
| OntoLearner: A Modular Python Library for Ontology Learning with LLMs https://py... | 2K |
| A CLI client meant to provide the core features of the ChatGPT and Le Chat weba... | 2K |
| ToolAgents is a lightweight and flexible framework for creating function-calling... | 2K |
| Multi-Agent framework | 2K |
| An easy-to-extend LLM annotator for robust, resumable data annotation. | 901 |
| AI Vulnerability Identification & Security Evaluation framework | 881 |
| General Information, model certifications, and benchmarks for nm-vllm enterprise... | 666 |
| A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Mo... | 640 |
| Mistral Voxtral plugin for the cjm-transcription-plugin-system library - provide... | 602 |
| Mistral Voxtral plugin for the cjm-transcription-plugin-system library - provide... | 546 |
| vLLM Kunlun3 backend plugin | 464 |
| | 455 |
| Voxtral audio processing and model implementation for Apple Silicon using MLX | 438 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 437 |
| PDF processing pipeline: remove headers/footers, convert to markdown, and genera... | 410 |
| Hexamind library to implement RAG solutions | 402 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 375 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 344 |
| A smart CLI friend | 313 |
| A simple CLI to transcribe Youtube videos or local audio/video files and produce... | 264 |
| Sparse AutoEncoder to decode Mistral LLM | 262 |
| Voxtral Mini Realtime speech-to-text in MLX | 256 |
| Add your description here | 198 |
| Slimmed release mirror of UniTrust for AEN and TruthPrInt. | 192 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 176 |
| Automatically generate commit messages from changes | 172 |
| A web scraping library based on LangChain which uses LLM and direct graph logic ... | 150 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 132 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 115 |
| Add your description here | 94 |
| use any llm api in a plug-and-play fashion | 92 |