PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics.
  • openvinotoolkit/nncf: Neural Network Compression Framework for enhanced OpenVINO™ inference (456K · 1K · 293)
  • tensorflow/tensorflow-model-optimization: A toolkit to optimize ML models for deployment with Keras and TensorFlow, including quantization and pruning (105K · 2K · 347)
  • quic/aimet-torch: AIMET is a library that provides advanced quantization and compression techniques for trained neural network models (29K · 3K · 450)
  • VainF/torch-pruning: [CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, vision foundation models, etc. (23K · 3K · 378)
  • intel/neural-compressor: SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) and sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime (22K · 3K · 304)
  • Ruya-AI/cozempic: Context cleaning for Claude Code: prune bloated sessions, protect Agent Teams from context loss, auto-guard with tiered pruning (20K · 277 · 17)
  • quic/aimet-onnx: AIMET is a library that provides advanced quantization and compression techniques for trained neural network models (19K · 3K · 450)
  • tensorflow/tf-model-optimization-nightly: A suite of tools that users, both novice and advanced, can use to optimize machine learning models for deployment and execution (10K · 2K · 347)
  • neuralmagic/deepsparse: Sparsity-aware deep learning inference runtime for CPUs (6K · 3K · 191)
  • neuralmagic/deepsparse-ent: Sparsity-aware deep learning inference runtime for CPUs (4K · 3K · 191)
  • neuralmagic/sparseml: Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models (3K · 2K · 156)
  • neuralmagic/sparsezoo: Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes (2K · 388 · 28)
  • satabios/sconce: End-to-end AutoML model compression package (2K · 45 · 4)
  • 666DZY666/micronet: A model compression and deployment library (2K · 2K · 477)
  • intel/neural-compressor-pt: SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) and sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime (1K · 3K · 304)
  • FasterAI-Labs/fasterai: FasterAI: prune and distill your models with FastAI and PyTorch (1K · 261 · 19)
  • delve-team/delve: PyTorch model training and layer saturation monitor (1K · 83 · 13)
  • neuralmagic/sparsify: ML model optimization product to accelerate inference (1K · 325 · 31)
  • open-mmlab/mmrazor: OpenMMLab Model Compression Toolbox and Benchmark (901 · 2K · 243)
  • r-papso/torch-optim: PyTorch model optimization via neural network pruning (812 · 3 · 1)
  • intel/neural-compressor-tf: Repository of Intel® Neural Compressor (739 · 3K · 304)
  • QVQZZZ/heflwr: HeFlwr: Federated Learning for Heterogeneous Devices (541 · 125 · 13)
  • tianyic/only-train-once: Only Train Once (OTO): Automatic One-Shot General DNN Training and Compression Framework (511 · 311 · 48)
  • intel/neural-compressor-full: SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) and sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime (476 · 3K · 304)
    • Data from PyPI, GitHub, ClickHouse, and BigQuery