PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
xlite-dev
ffpa-attn

FFPA: extends FlashAttention-2 with Split-D for ~O(1) SRAM complexity at large head dims; 1.8x~3x speedup vs SDPA.

1K 277 16
ot-triton-lab
flash-sinkhorn

Sinkhorn optimal transport kernels in PyTorch + Triton (squared Euclidean, no cost matrix materialization).

1K 189 19
HKUSTDial
flash-sparse-attn

Trainable, fast, and memory-efficient sparse attention

518 637 60
davidkny22
easywheels

Smart GPU wheel installer. Auto-detects CUDA, GPU, torch, and Python.

395 0 0
kyegomez
flashmha

A simple PyTorch implementation of Flash Multi-Head Attention

386 22 4
Mapika
gpkg

GPU package manager — find prebuilt CUDA wheels, build missing ones

321 0 0
DAMO-NLP-SG
inf-cl

[CVPR 2025 Highlight] The official CLIP training codebase for Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.

266 284 12
erfanzar
jax-flash-attn2

A flexible and efficient implementation of Flash Attention 2.0 for JAX, supporting multiple backends (GPU/TPU/CPU) and platforms (Triton/Pallas/JAX).

154 34 1
egaoharu-kensei
flash-attention-triton

Cross-platform FlashAttention-2 Triton implementation for Turing+ GPUs with a custom configuration mode

154 26 0
SmallDoges
flash-dmattn

Flash Dynamic Mask Attention: Fast and Memory-Efficient Trainable Dynamic Mask Sparse Attention

134 594 54
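
Several of the packages above position themselves against PyTorch's built-in scaled dot-product attention (SDPA); ffpa-attn, for example, reports a 1.8x~3x speedup over it. For reference, a minimal SDPA baseline sketch, assuming PyTorch >= 2.0 and purely illustrative tensor shapes:

    import torch
    import torch.nn.functional as F

    # Illustrative shapes: batch, heads, sequence length, head dimension.
    batch, heads, seq, head_dim = 2, 8, 1024, 64
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    q = torch.randn(batch, heads, seq, head_dim, device=device, dtype=dtype)
    k = torch.randn_like(q)
    v = torch.randn_like(q)

    # Fused attention; on CUDA this can dispatch to a FlashAttention-style
    # kernel when shapes and dtypes allow.
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    print(out.shape)  # torch.Size([2, 8, 1024, 64])

This is the generic PyTorch baseline, not the API of any package listed here; consult each project's own documentation for its interface.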
Data from PyPI, GitHub, ClickHouse, and BigQuery