14 dependents
| Package | Description | Downloads/month |
| --- | --- | --- |
| | A high-throughput and memory-efficient inference and serving engine for LLMs | 9.4M |
| | Structured Outputs | 1.6M |
| | A high-throughput and memory-efficient inference and serving engine for LLMs | 143K |
| | Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 31K |
| | Large-scale LLM inference engine | 7K |
| | LLM-powered security log analyzer: detect threats & anomalies with zero regex — ... | 5K |
| | Dria SDK is for building and executing synthetic data generation pipelines on Dr... | 4K |
| | Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 3K |
| | vLLM CPU inference engine (AVX512 + VNNI optimized) | 3K |
| | Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 3K |
| | vLLM CPU inference engine (AVX512 optimized) | 2K |
| | A high-throughput and memory-efficient inference and serving engine for LLMs | 437 |
| | A high-throughput and memory-efficient inference and serving engine for LLMs | 375 |
| | Customize, control, and enhance LLM generation with logits processors, featuring... | 342 |