PyPI Stats

GPU Python Packages

Python packages tagged with the GitHub topic "gpu", sorted by relevance and shown with GitHub stars and monthly downloads.
pytorch
torch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

84.8M 100K 28K
catboost
catboost

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

6.4M 9K 1K
wookayin
gpustat

📊 A simple command-line utility for querying and monitoring GPU status

5.7M 4K 286
apache
apache-tvm-ffi

Open ABI and FFI for Machine Learning Systems

4.7M 388 75
NVIDIA
nvidia-cutlass-dsl

CUDA Templates and Python DSLs for High-Performance Linear Algebra

4.1M 10K 2K
flashinfer-ai
flashinfer-python

FlashInfer: Kernel Library for LLM Serving

4.1M 6K 948
NVIDIA
nvidia-cutlass-dsl-libs-base

CUDA Templates and Python DSLs for High-Performance Linear Algebra

3.5M 10K 2K
cupy
cupy-cuda12x

NumPy & SciPy for GPU

3.4M 11K 1K
ashvardanian
stringzilla

Up to 100x faster strings for C, C++, CUDA, Python, Rust, Swift, JS, & Go, leveraging NEON, AVX2, AVX-512, SVE, GPGPU, & SWAR to accelerate search, hashing, sorting, edit distances, sketches, and memory ops 🦖

3.2M 3K 124
flashinfer-ai
flashinfer-cubin

FlashInfer: Kernel Library for LLM Serving

2.7M 6K 948
skypilot-org
skypilot

Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, Slurm, 20+ clouds, on-prem).

1.9M 10K 1K
isl-org
open3d

Open3D: A Modern Library for 3D Data Processing

1.7M 14K 3K
meta-pytorch
torchrec

PyTorch domain library for recommendation systems

1.5M 3K 642
pytorch
torch-model-archiver

Serve, optimize and scale PyTorch models in production

1.3M 4K 886
deepspeedai
deepspeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

1.3M 42K 5K
nvidia
cuda-tile

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

1.1M 2K 134
NVIDIA
warp-lang

A Python framework for GPU-accelerated simulation, robotics, and machine learning.

975K 7K 494
fastai
fastai

The fastai deep learning library

971K 28K 8K
PennyLaneAI
pennylane-lightning

The Lightning plugin ecosystem provides fast quantum state-vector and tensor network simulators written in C++ for use with PennyLane.

969K 136 52
intel-analytics
ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.

824K 9K 1K
runpod
runpod

🐍 | Python library for RunPod API and serverless worker SDK.

703K 293 112
Qiskit
qiskit-aer

Aer is a high performance simulator for quantum circuits that includes noise models

674K 658 431
clearml
clearml-agent

ClearML Agent - ML-Ops made easy. ML-Ops scheduler & orchestration solution

579K 294 115
spotty-cloud
spotty

Training deep learning models on AWS and GCP instances

557K 493 43

Data from PyPI, GitHub, ClickHouse, and BigQuery.