PyPI Stats

Blackwell Python Packages

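Python packages with the GitHub topic blackwell, sorted by relevance. Each entry lists the repository owner, the package name, a description, and then monthly downloads, GitHub stars, and a third repository count (presumably forks).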
sgl-project
sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

306.7M 27K 6K
vllm-project
vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

8.9M 79K 16K
sgl-project
sglang-kernel

SGLang is a high-performance serving framework for large language models and multimodal models.

273K 27K 6K
sgl-project
sgl-kernel

SGLang is a high-performance serving framework for large language models and multimodal models.

254K 27K 6K
vllm-project
vllm-tpu

A high-throughput and memory-efficient inference and serving engine for LLMs

145K 79K 16K
NVIDIA
tensorrt-llm

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

16K 14K 2K
sgl-project
sglang-kt

SGLang is a high-performance serving framework for large language models and multimodal models.

4K 27K 6K
patrick-toulme
pyptx

A Python DSL to write Nvidia PTX for Hopper and Blackwell in JAX and PyTorch

3K 265 15
m96-chan
pygpukit

Minimal GPU runtime for Python - high-performance CUDA kernels, memory management, and LLM inference without heavy dependencies

3K 2 0
jpietek
penguin-burner

Nvidia ultimate undervolting companion on Linux. Now with a nice UI. Supports MSI Afterburner profile imports and LACT profile exports. Can automatically scan for the most optimal GPU VF curve and generate silent fan curves.

2K 31 0
thc1006
taiwan-asr-toolkit

Production-grade Traditional Chinese / Taiwan Mandarin speech-to-text. Qwen3-ASR + MediaTek Breeze-ASR-25, hot-word injection, LLM polish, speaker diarization. RTF up to 1554x on RTX 5090, 56 TDD tests.

759 1 0
sgl-project
dblcsgen

SGLang is a high-performance serving framework for large language models and multimodal models.

613 27K 6K
vllm-project
vllm-xft

A high-throughput and memory-efficient inference and serving engine for LLMs

485 79K 16K
vllm-project
vllm-acc

A high-throughput and memory-efficient inference and serving engine for LLMs

484 79K 16K
vllm-project
vllm-hust

A high-throughput and memory-efficient inference and serving engine for LLMs

480 79K 16K
vllm-project
ai-dynamo-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

437 79K 16K
vllm-project
wxy-test

A high-throughput and memory-efficient inference and serving engine for LLMs

394 2K 1K
vllm-project
nextai-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

378 79K 16K
vllm-project
vllm-consul

A high-throughput and memory-efficient inference and serving engine for LLMs

306 79K 16K
vllm-project
vllm-npu

A high-throughput and memory-efficient inference and serving engine for LLMs

281 79K 16K
vllm-project
vllm-musa

A high-throughput and memory-efficient inference and serving engine for LLMs

279 79K 16K
vllm-project
vllm-emissary

A high-throughput and memory-efficient inference and serving engine for LLMs

188 79K 16K
vllm-project
vllm-usf

A high-throughput and memory-efficient inference and serving engine for LLMs

166 79K 16K
egaoharu-kensei
flash-attention-triton

Cross-platform FlashAttention-2 Triton implementation for Turing+ GPUs with custom configuration mode

146 26 0
Data from PyPI, GitHub, ClickHouse, and BigQuery