High-performance CPU KV-cache quantization engine for LLM inference (~10× speedup, 4× memory reduction) with Python & PyTorch support.
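The engine's internals aren't shown here, but the stated 4× memory reduction is consistent with storing keys/values in 8-bit integers instead of 32-bit floats. As a minimal, hypothetical sketch of that idea (symmetric per-channel int8 quantization, in NumPy for self-containedness; function names are illustrative, not this project's API):

```python
import numpy as np

def quantize_kv(kv: np.ndarray):
    """Symmetric per-channel int8 quantization of a KV-cache tensor.

    kv: float32 array of shape (seq_len, num_heads, head_dim).
    Returns (int8 codes, per-channel float32 scales).
    """
    # Scale each channel by its max magnitude so values map into [-127, 127].
    scales = np.abs(kv).max(axis=0, keepdims=True) / 127.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero
    q = np.clip(np.round(kv / scales), -127, 127).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize_kv(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    # Reconstruct approximate float32 values from codes and scales.
    return q.astype(np.float32) * scales

kv = np.random.randn(128, 8, 64).astype(np.float32)
q, scales = quantize_kv(kv)

# int8 storage is 4x smaller than float32; the per-channel scales
# add only negligible overhead.
print(kv.nbytes / q.nbytes)  # 4.0
```

Dequantization introduces a small per-element error bounded by half the channel scale, which is the usual accuracy/memory trade-off such engines tune.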