49 dependents
Package Description Downloads/month
SGLang is a high-performance serving framework for large language models and mul... 287.7M
A high-throughput and memory-efficient inference and serving engine for LLMs 9.4M
A high-throughput and memory-efficient inference and serving engine for LLMs 143K
A toolset for compressing, deploying and serving LLM 123K
A simple and powerful tool to get things done with AI 79K
Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... 31K
TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... 16K
MindRoot AI Agent Framework 14K
Accelerate, Optimize performance with streamlined training and serving options w... 9K
Large-scale LLM inference engine 7K
The official zero-trust, high-throughput kinetic execution engine for the coreas... 4K
SGLang is a high-performance serving framework for large language models and mul... 4K
Python component of using Briton 4K
Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... 3K
Offline voice agent framework for robots. 3K
Open-source framework for building AI-powered apps in JavaScript, Go, and Python... 3K
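fastllm is a high-performance large-model inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with 10 GB or more of memory can run the full DeepSeek model. Dual-socket 900... 3K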
vLLM CPU inference engine (AVX512 + VNNI optimized) 3K
Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... 3K
vLLM CPU inference engine (AVX512 optimized) 2K
FuriosaAI SDK 2K
INF Tech's open-source MLLMs for SOTA visual-language understanding and advanced... 1K
A complete terminal implementation of Anthropic's Claude. 1K
useful utilities for prompt engineering 1K
A toolset for compressing, deploying and serving LLM 887
Siada CLI is an AI pair programming tool in the terminal 689
General Information, model certifications, and benchmarks for nm-vllm enterprise... 666
vLLM Kunlun3 backend plugin 464
A high-throughput and memory-efficient inference and serving engine for LLMs 437
A high-throughput and memory-efficient inference and serving engine for LLMs 375
353
A high-throughput and memory-efficient inference and serving engine for LLMs 344
JAX backend for SGL 295
Modular Multimodal Intelligent Reformatting and Augmentation Generation Engine -... 260
A tool for LLM agent conversations 219
SGLang is yet another fast serving framework for large language models and visio... 209
SkillEngine — framework-agnostic skills engine for LLM agents. Claude Code-like ... 192
A minimal wrapper for the google gemini (google-genai) API 189
SGLang fork for ppc64le with CUDA 12.4 and Torch Triton support 186
A high-throughput and memory-efficient inference and serving engine for LLMs 176
Convert infrastructure scans into various output formats such as Markdown tables... 151
A high-throughput and memory-efficient inference and serving engine for LLMs 132
A high-throughput and memory-efficient inference and serving engine for LLMs 115
A high-throughput and memory-efficient inference and serving engine for LLMs 80
Inferencing and Training Large Language Model Tasks 73
Genkit AI Framework 70
An agent framework using LLMs 56
A high-throughput and memory-efficient inference and serving engine for LLMs 42
SGLang is a fast serving framework for large language models and vision language... 2