PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics.

ModelCloud / gptqmodel
LLM model quantization (compression) toolkit with HW acceleration support for Nvidia, AMD, and Intel GPUs and Intel/AMD/Apple CPUs via HF, vLLM, and SGLang.
38K · 1K · 185
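
For context on what this top result does in practice, here is a minimal post-training quantization sketch using gptqmodel's documented GPTQModel.load / quantize / save flow. The model ID, the toy calibration text, and the output path are placeholders, not recommendations; real calibration uses a few hundred representative samples.

    from gptqmodel import GPTQModel, QuantizeConfig

    # 4-bit GPTQ with per-group scales; group_size=128 is a common default.
    quant_config = QuantizeConfig(bits=4, group_size=128)

    # Placeholder model ID; any Hugging Face causal LM is the intended input.
    model = GPTQModel.load("meta-llama/Llama-3.2-1B-Instruct", quant_config)

    # GPTQ reconstructs weights against calibration text; a toy sentence
    # stands in here for a real calibration corpus such as C4 excerpts.
    calibration = ["Quantization trades precision for memory and speed."] * 256
    model.quantize(calibration, batch_size=2)

    # Placeholder output directory for the quantized checkpoint.
    model.save("llama-3.2-1b-gptq-4bit")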

intel / neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
22K · 3K · 304
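
Similarly, a minimal sketch of Intel Neural Compressor's documented 2.x post-training INT8 flow on PyTorch via quantization.fit. The tiny Sequential model, the random calibration data, and the save path are stand-ins for illustration only.

    import torch
    from neural_compressor.config import PostTrainingQuantConfig
    from neural_compressor.quantization import fit

    # Stand-in FP32 model; real use targets a trained network.
    fp32_model = torch.nn.Sequential(
        torch.nn.Linear(64, 64), torch.nn.ReLU(), torch.nn.Linear(64, 8)
    )

    # Calibration loader yielding (input, label) batches; random data here.
    calib_loader = torch.utils.data.DataLoader(
        [(torch.randn(64), 0) for _ in range(32)], batch_size=8
    )

    # The default PostTrainingQuantConfig requests static INT8 quantization.
    q_model = fit(
        model=fp32_model,
        conf=PostTrainingQuantConfig(),
        calib_dataloader=calib_loader,
    )
    q_model.save("./int8-model")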

intel / neural-compressor-pt
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
1K · 3K · 304

intel / neural-compressor-tf
Repository of Intel® Neural Compressor
739 · 3K · 304

stef41 / quantbenchx
Quantization quality analyzer: benchmarks GGUF/GPTQ/AWQ quantization accuracy.
716 · 1 · 0

lpalbou / model-quantizer
A tool for quantizing large language models
530 · 2 · 0

intel / neural-compressor-full
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
476 · 3K · 304

matlok-ai / bampe-weights
Tools for profiling, extracting, visualizing, and reusing generative AI weights, aimed at building more accurate AI models and at auditing/scanning weights at rest to identify knowledge domains for risk.
463 · 10 · 0

bobazooba / xllm
Simple & cutting-edge LLM fine-tuning
458 · 408 · 21

intel / neural-solution
Repository of Intel® Neural Compressor
384 · 3K · 304

intel / lpot
Repository of Intel® Low Precision Optimization Tool
384 · 3K · 302

intel / neural-insights
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
340 · 3K · 304

ShipItAndPray / quantcrush
Crush any LLM to 6x smaller in one command. Supports GGUF, GPTQ, and AWQ.
121 · 4 · 1

intel / ilit
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
94 · 3K · 304

intel / neural-compressor-3x-tf
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
14 · 3K · 304

intel / neural-compressor-3x-pt
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
13 · 3K · 304

intel / neural-compressor-3x-ort
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
6 · 3K · 304

    • Data from PyPI, GitHub, ClickHouse, and BigQuery