TurboQuant Python Packages

quantcpp

LLM inference with 7x longer context. Pure C, zero dependencies. Lossless KV cache compression + single-header library.

38K 386 42

turboquant-vllm

TurboQuant KV cache compression plugin for vLLM — asymmetric K/V, 8 models validated, consumer GPUs

8K 46 5

turboquant

First open-source TurboQuant KV cache compression for LLM inference. Drop-in for HuggingFace. pip install turboquant.

4K 33 7

turbovec

A vector index built on TurboQuant, written in Rust with Python bindings

4K 529 42

turboquant-mlx-full

Extreme weight and KV cache compression for LLMs on Apple Silicon (MLX implementation of Google's TurboQuant)

3K 8 1

turboquant-space

TurboQuant (ICLR 2026) — SIMD-accelerated 4/8-bit quantization Space for ANN

2K 6 0

tqai

TurboQuant KV cache compression for local LLM inference

2K 1 0

turboquant-vectors

Compress embeddings 6x instantly with TurboQuant. First pip package using Google's TurboQuant (ICLR 2026) for vector search. 71.9% recall vs FAISS PQ 13.3%.

628 1 1

turboquant-hf

Near-optimal weight quantization for LLMs using Google's TurboQuant algorithm

320 0 0

fused-turboquant

Fused Triton encode/decode kernels for TurboQuant KV cache compression, powered by Randomized Hadamard Transform.

208 8 0

langchain-turboquant

TurboQuant vector store for LangChain — 6x memory reduction with training-free quantization

171 1 2

turbokv

First open-source implementation of TurboQuant (arXiv 2504.19874) — 4-7x LLM KV cache compression

164 1 0

turboquant-impl

First open-source implementation of TurboQuant (arXiv 2504.19874) — 4-7x LLM KV cache compression

122 1 0

commitmind

CommitMind: Semantic search for Git commit history powered by TurboQuant vector compression (ICLR 2026). Search commits by meaning, not just keywords.

115 0 0
