PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter them by metrics
Alberto-Codes / turboquant-vllm
TurboQuant KV cache compression plugin for vLLM — asymmetric K/V, 8 models validated, consumer GPUs
8K 46 5
tanavc1 / llm-autotune
Zero-config local LLM optimization for Ollama, LM Studio, and Apple Silicon MLX. Reduces TTFT by 40%, wall time for local agents by 46%, and RAM usage by 3x.
7K 24 1
brontoguana / krasis
Krasis is no longer distributed via PyPI. Install from GitHub: https://github.com/brontoguana/krasis
5K 447 22
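Since krasis is no longer published to PyPI, it can be installed straight from the linked repository using pip's documented VCS (`git+` URL) support — a sketch assuming the default branch builds as a normal pip-installable package:

```shell
# Install krasis directly from its GitHub repository via pip's Git support
pip install "git+https://github.com/brontoguana/krasis"
```

To pin a specific revision, pip's VCS syntax also accepts a `@<tag-or-commit>` suffix after the URL.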
BenevolentJoker-JohnL / sollol
Super Ollama Load Balancer - Performance-aware routing for distributed Ollama deployments with Ray, Dask, and adaptive metrics
2K 4 2
EfficientContext / contextpilot
Fast Long-Context Inference via Context Reuse
1K 81 6
wild-edge / wildedge-sdk
Python SDK for WildEdge
663 13 1
alibaba / torch-quant
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
399 924 168
ManuelSLemos / rabbitllm
Run 70B+ LLMs on a single 4GB GPU — no quantization required. Layer-streaming inference for consumer hardware.
344 53 9
ZFTurbo / kito
Optimizes the layer structure of a Keras model to reduce computation time
319 157 18
saikoushiknalubola / thinkrouter
Cut LLM reasoning-token costs by 60% with one line of code
303 2 0
fabriziopfannl / llm-autobatch
Turn single LLM calls into fast micro-batches. Rust core, Python API.
82 4 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery