35 dependents

| Description | Downloads/month |
| --- | --- |
| SGLang is a high-performance serving framework for large language models and mul... | 287.7M |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 9.4M |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 143K |
| A collection of reference inference implementations for gpt-oss by OpenAI | 129K |
| A toolset for compressing, deploying and serving LLM | 123K |
| Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 31K |
| A high-performance API server that provides OpenAI-compatible endpoints for MLX ... | 22K |
| TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... | 16K |
| An open platform for training, serving, and evaluating large language model base... | 15K |
| Large-scale LLM inference engine | 7K |
| SGLang is a high-performance serving framework for large language models and mul... | 4K |
| A flexible command-line chat loop framework for building AI agents with support ... | 3K |
| Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 3K |
| vLLM CPU inference engine (AVX512 + VNNI optimized) | 3K |
| Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 3K |
| rnow CLI - Reinforcement Learning platform command-line interface | 2K |
| vLLM CPU inference engine (AVX512 optimized) | 2K |
| A fully local GPU poor, multimodal Retrieval-Augmented Generation (RAG) system ... | 2K |
| FuriosaAI SDK | 2K |
| INF Tech's open-source MLLMs for SOTA visual-language understanding and advanced... | 1K |
| A toolset for compressing, deploying and serving LLM | 887 |
| A fully local GPU poor, multimodal Retrieval-Augmented Generation (RAG) system ... | 496 |
| vLLM Kunlun3 backend plugin | 464 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 437 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 375 |
| A local/offline-capable voice assistant with speech recognition, LLM processing,... | 324 |
| Minimal OpenAI-compatible server for GPT-OSS models on Apple Silicon | 281 |
| LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization us... | 243 |
| SGLang fork for ppc64le with CUDA 12.4 and Torch Triton support | 186 |
| Any Message Could be a String, for LLM Usage | 180 |
| AI Agent with intelligent planning and reasoning | 144 |
| A high-throughput and memory-efficient inference and serving engine for LLMs | 115 |
| Add your description here | 94 |
|  | 71 |
| SGLang is a fast serving framework for large language models and vision language... | 2 |