PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics. Each result below lists the GitHub owner, the package name, its summary, and three counts: PyPI downloads, GitHub stars, and GitHub forks.

vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
9.4M downloads · 79K stars · 16K forks

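vLLM's offline Python API is minimal; a quick sketch of single-prompt generation (the model ID is only an example, any Hugging Face causal LM works):

    from vllm import LLM, SamplingParams

    # Load any Hugging Face model ID; this one is only an example.
    llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")
    params = SamplingParams(temperature=0.8, max_tokens=64)
    outputs = llm.generate(["What is a KV cache?"], params)
    print(outputs[0].outputs[0].text)

The same engine also exposes an OpenAI-compatible HTTP server via the vllm serve CLI.
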
modelscope / ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).
171K downloads · 14K stars · 1K forks

vllm-project / vllm-tpu
A high-throughput and memory-efficient inference and serving engine for LLMs
143K downloads · 79K stars · 16K forks

hud-evals / hud-python
OSS RL environment + evals toolkit
100K downloads · 248 stars · 57 forks

n24q02m / qwen3-embed
Lightweight ONNX inference for Qwen3 embedding and reranking models
11K downloads · 2 stars · 0 forks

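The package's own interface isn't shown on this page; under the hood, ONNX embedding inference typically follows the onnxruntime pattern below (the model path, input names, and last-token pooling are assumptions about the export, not this package's documented API):

    import numpy as np
    import onnxruntime as ort
    from transformers import AutoTokenizer

    # Hypothetical file and input names; the actual export may differ.
    tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
    sess = ort.InferenceSession("qwen3-embedding.onnx")
    enc = tok("hello world", return_tensors="np")
    outputs = sess.run(None, {"input_ids": enc["input_ids"],
                              "attention_mask": enc["attention_mask"]})
    hidden = outputs[0]            # assumed shape: (batch, seq, dim)
    emb = hidden[:, -1, :]         # last-token pooling, per Qwen embedding convention
    emb /= np.linalg.norm(emb, axis=-1, keepdims=True)
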
NVIDIA / nemo-automodel
🚀 PyTorch Distributed-native training library for LLMs/VLMs with out-of-the-box Hugging Face support
3K downloads · 479 stars · 140 forks

julep-ai / steadytext
Deterministic text generation and embeddings with zero configuration
1K downloads · 43 stars · 2 forks

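Assuming the function names match the project's README (they are recalled here, not verified), deterministic generation and embedding look roughly like this:

    import steadytext

    # Assumed API from the project's README; names may have changed.
    a = steadytext.generate("write a haiku about caching")
    b = steadytext.generate("write a haiku about caching")
    assert a == b                      # same prompt, same output, every run
    vec = steadytext.embed("hello")    # fixed-size embedding vector
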
zilliztech / deepsearcher
(no description provided)
728 downloads · 8K stars · 752 forks

jimnoneill / obsidian-umbra
Turn any Obsidian vault into a Zettelkasten graph, locally, with a dozen years of notes in minutes. 4-phase pipeline: daily splitter (Qwen3-4B) → semantic backlinks (Potion-32M) → keyword linker → synonym clustering (GTE-large + HDBSCAN). Zero cloud.
660 downloads · 3 stars · 0 forks

Keyvanhardani / german-ocr
High-performance German document OCR - Local & Cloud with GPU/CPU support
439 downloads · 94 stars · 6 forks

vllm-project / vllm-hust
A high-throughput and memory-efficient inference and serving engine for LLMs
437 downloads · 79K stars · 16K forks

GGUFloader / ggufloader
GGUF model loader with an agentic mode and a floating-button UI; open source and offline. Supports Mistral, DeepSeek, Llama, Gemma, and Qwen models.
390 downloads · 42 stars · 11 forks

vllm-project / wxy-test
A high-throughput and memory-efficient inference and serving engine for LLMs
375 downloads · 2K stars · 1K forks

FluffyAIcode / kakeyalattice
Discrete Kakeya cover for LLM KV cache: D4/E8 nested-lattice quantisation realising a Kakeya-style tube-cover over the direction sphere. 2.4x-2.8x compression at <1% perplexity loss on Qwen3, Llama-3, DeepSeek, GLM-4, Gemma. Drop-in replacement for transformers.DynamicCache. pip install kakeyalattice.
365 downloads · 7 stars · 2 forks

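The "drop-in transformers.DynamicCache replacement" claim suggests usage like the sketch below. The stock transformers calls are real; the kakeyalattice import is a guess at the package's API and is left commented out:

    from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache
    # from kakeyalattice import KakeyaLatticeCache  # hypothetical name; check the package docs

    tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
    model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
    inputs = tok("The capital of France is", return_tensors="pt")

    cache = DynamicCache()  # a drop-in replacement cache would be substituted here
    out = model.generate(**inputs, past_key_values=cache, max_new_tokens=16)
    print(tok.decode(out[0], skip_special_tokens=True))
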
vllm-project / vllm-xft
A high-throughput and memory-efficient inference and serving engine for LLMs
345 downloads · 79K stars · 16K forks

vllm-project / ai-dynamo-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
344 downloads · 79K stars · 16K forks

vllm-project / vllm-acc
A high-throughput and memory-efficient inference and serving engine for LLMs
342 downloads · 79K stars · 16K forks

vllm-project / nextai-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
273 downloads · 79K stars · 16K forks

ChaokunHong / metascreener
Open-source multi-LLM ensemble tool for systematic review workflows
256 downloads · 1K stars · 48 forks

vllm-project / vllm-consul
A high-throughput and memory-efficient inference and serving engine for LLMs
219 downloads · 79K stars · 16K forks

vllm-project / vllm-npu
A high-throughput and memory-efficient inference and serving engine for LLMs
209 downloads · 79K stars · 16K forks

vllm-project / vllm-musa
A high-throughput and memory-efficient inference and serving engine for LLMs
194 downloads · 79K stars · 16K forks

vllm-project / vllm-rocm
A high-throughput and memory-efficient inference and serving engine for LLMs
176 downloads · 79K stars · 16K forks

vllm-project / vllm-emissary
A high-throughput and memory-efficient inference and serving engine for LLMs
132 downloads · 79K stars · 16K forks

    • Data from PyPI, GitHub, ClickHouse, and BigQuery