PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics. Each result below lists the GitHub owner, the package name, its summary, and three counts: PyPI downloads, GitHub stars, and GitHub forks.

vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
9.4M downloads · 79K stars · 16K forks

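vLLM's offline Python API is minimal; a quick sketch of single-prompt generation (the model ID is only an example, any Hugging Face causal LM works):

    from vllm import LLM, SamplingParams

    # Load any Hugging Face model ID; this one is only an example.
    llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")
    params = SamplingParams(temperature=0.8, max_tokens=64)
    outputs = llm.generate(["What is a KV cache?"], params)
    print(outputs[0].outputs[0].text)

The same engine also exposes an OpenAI-compatible HTTP server via the vllm serve CLI.
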
modelscope / ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).
171K downloads · 14K stars · 1K forks

vllm-project / vllm-tpu
A high-throughput and memory-efficient inference and serving engine for LLMs
143K downloads · 79K stars · 16K forks

hud-evals / hud-python
OSS RL environment + evals toolkit
100K downloads · 248 stars · 57 forks

n24q02m / qwen3-embed
Lightweight ONNX inference for Qwen3 embedding and reranking models
11K downloads · 2 stars · 0 forks

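The package's own interface isn't shown on this page; under the hood, ONNX embedding inference typically follows the onnxruntime pattern below (the model path, input names, and last-token pooling are assumptions about the export, not this package's documented API):

    import numpy as np
    import onnxruntime as ort
    from transformers import AutoTokenizer

    # Hypothetical file and input names; the actual export may differ.
    tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
    sess = ort.InferenceSession("qwen3-embedding.onnx")
    enc = tok("hello world", return_tensors="np")
    outputs = sess.run(None, {"input_ids": enc["input_ids"],
                              "attention_mask": enc["attention_mask"]})
    hidden = outputs[0]            # assumed shape: (batch, seq, dim)
    emb = hidden[:, -1, :]         # last-token pooling, per Qwen embedding convention
    emb /= np.linalg.norm(emb, axis=-1, keepdims=True)
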
NVIDIA / nemo-automodel
🚀 PyTorch Distributed-native training library for LLMs/VLMs with out-of-the-box Hugging Face support
3K downloads · 479 stars · 140 forks

julep-ai / steadytext
Deterministic text generation and embeddings with zero configuration
1K downloads · 43 stars · 2 forks

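Assuming the function names match the project's README (they are recalled here, not verified), deterministic generation and embedding look roughly like this:

    import steadytext

    # Assumed API from the project's README; names may have changed.
    a = steadytext.generate("write a haiku about caching")
    b = steadytext.generate("write a haiku about caching")
    assert a == b                      # same prompt, same output, every run
    vec = steadytext.embed("hello")    # fixed-size embedding vector
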
zilliztech / deepsearcher
(no description provided)
728 downloads · 8K stars · 752 forks

jimnoneill / obsidian-umbra
Turn any Obsidian vault into a Zettelkasten graph, locally, with a dozen years of notes in minutes. 4-phase pipeline: daily splitter (Qwen3-4B) → semantic backlinks (Potion-32M) → keyword linker → synonym clustering (GTE-large + HDBSCAN). Zero cloud.
660 downloads · 3 stars · 0 forks

Keyvanhardani / german-ocr
High-performance German document OCR - Local & Cloud with GPU/CPU support
439 downloads · 94 stars · 6 forks

vllm-project / vllm-hust
A high-throughput and memory-efficient inference and serving engine for LLMs
437 downloads · 79K stars · 16K forks

GGUFloader / ggufloader
GGUF model loader with an agentic mode and a floating-button UI; open source and offline. Supports Mistral, DeepSeek, Llama, Gemma, and Qwen models.
390 downloads · 42 stars · 11 forks

vllm-project / wxy-test
A high-throughput and memory-efficient inference and serving engine for LLMs
375 downloads · 2K stars · 1K forks

FluffyAIcode / kakeyalattice
Discrete Kakeya cover for LLM KV cache: D4/E8 nested-lattice quantisation realising a Kakeya-style tube-cover over the direction sphere. 2.4x-2.8x compression at <1% perplexity loss on Qwen3, Llama-3, DeepSeek, GLM-4, Gemma. Drop-in replacement for transformers.DynamicCache. pip install kakeyalattice.
365 downloads · 7 stars · 2 forks

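The "drop-in transformers.DynamicCache replacement" claim suggests usage like the sketch below. The stock transformers calls are real; the kakeyalattice import is a guess at the package's API and is left commented out:

    from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache
    # from kakeyalattice import KakeyaLatticeCache  # hypothetical name; check the package docs

    tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
    model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
    inputs = tok("The capital of France is", return_tensors="pt")

    cache = DynamicCache()  # a drop-in replacement cache would be substituted here
    out = model.generate(**inputs, past_key_values=cache, max_new_tokens=16)
    print(tok.decode(out[0], skip_special_tokens=True))
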
vllm-project / vllm-xft
A high-throughput and memory-efficient inference and serving engine for LLMs
345 downloads · 79K stars · 16K forks

vllm-project / ai-dynamo-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
344 downloads · 79K stars · 16K forks

vllm-project / vllm-acc
A high-throughput and memory-efficient inference and serving engine for LLMs
342 downloads · 79K stars · 16K forks

vllm-project / nextai-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
273 downloads · 79K stars · 16K forks

ChaokunHong / metascreener
Open-source multi-LLM ensemble tool for systematic review workflows
256 downloads · 1K stars · 48 forks

vllm-project / vllm-consul
A high-throughput and memory-efficient inference and serving engine for LLMs
219 downloads · 79K stars · 16K forks

vllm-project / vllm-npu
A high-throughput and memory-efficient inference and serving engine for LLMs
209 downloads · 79K stars · 16K forks

vllm-project / vllm-musa
A high-throughput and memory-efficient inference and serving engine for LLMs
194 downloads · 79K stars · 16K forks

vllm-project / vllm-rocm
A high-throughput and memory-efficient inference and serving engine for LLMs
176 downloads · 79K stars · 16K forks

vllm-project / vllm-emissary
A high-throughput and memory-efficient inference and serving engine for LLMs
132 downloads · 79K stars · 16K forks

    • Data from PyPI, GitHub, ClickHouse, and BigQuery