304 dependents
| Package | Description | Downloads/month |
|---|---|---|
| | SGLang is a high-performance serving framework for large language models and mul... | 287.7M |
| | A high-throughput and memory-efficient inference and serving engine for LLMs | 9.4M |
| | FlashInfer: Kernel Library for LLM Serving | 4M |
| | Ready-to-use OCR with 80+ supported languages and all popular writing scripts in... | 2.7M |
| | DeepSpeed is a deep learning optimization library that makes distributed trainin... | 1.3M |
| | A static type analyzer for Python code | 1M |
| | FlashAttention-3 forward | 521K |
| | Neural Network Compression Framework for enhanced OpenVINO™ inference | 456K |
| | SevenNet - a graph neural network interatomic potential package supporting effic... | 445K |
| | A unified library of SOTA model optimization techniques like quantization, pruni... | 376K |
| | cortical is the framework for building fabric architectures | 325K |
| | A pytorch quantization backend for optimum | 267K |
| | ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22... | 233K |
| | Setuptools extension to build and package CMake projects | 204K |
| | OpenEquivariance: a fast, open-source GPU JIT kernel generator for the Clebsch-G... | 184K |
| | a tiny package for fast python c++ binding build. | 157K |
| | A high-throughput and memory-efficient inference and serving engine for LLMs | 143K |
| | Causal depthwise conv1d in CUDA, with a PyTorch interface | 131K |
| | Mamba SSM architecture | 125K |
| | CUDA accelerated rasterization of gaussian splatting | 98K |
| | Making it easier to work with shaders | 95K |
| | Late Interaction Models Training & Retrieval | 71K |
| | AMD Quark is a comprehensive cross-platform toolkit designed to simplify and enh... | 65K |
| | Python library for generating high-performance implementations of stencil kernel... | 57K |
| | | 56K |
| | Google Fonts Tools is a set of command-line tools for testing font projects | 41K |
| | LLM model quantization (compression) toolkit with HW acceleration support for Nv... | 38K |
| | AutoCoder: AutoCoder | 37K |
| | All-in-one repository for state-of-the-art NeRFs | 34K |
| | Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... | 31K |
| | TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... | 16K |
| | Official repository of the xLSTM. | 15K |
| | Compiler for color fonts | 12K |
| | ByzerLLM: Byzer LLM | 10K |
| | Python extension language using accelerators | 10K |
| | | 9K |
| | Jolt is a task execution tool designed for software development tasks. It can bu... | 8K |
| | TorchANI 2.0 is an open-source library that supports training, development, and ... | 8K |
| | Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch | 7K |
| | Large-scale LLM inference engine | 7K |
| | Lightweight, open-source, high-performance Yolo implementation | 7K |
| | A fast inference library for running LLMs locally on modern consumer-class GPUs | 6K |
| | This is the module for detecting and classifying text on rama pictures | 6K |
| | Open-source python package for multicomponent multiphase equilibrium CALPHAD cal... | 5K |
| | FlashAttention-3 | 5K |
| | Fast Hadamard transform in CUDA, with a PyTorch interface | 5K |
| | Your virtual engineering laboratory: An all-in-one package for sensor simulation... | 5K |
| | Compare two fonts | 5K |
| | Use late-interaction multi-modal models such as ColPali in just a few lines of c... | 5K |
| | Deep learning backtracing and explainability toolkit | 4K |