13 dependents
| Package | Description | Downloads/month |
|---|---|---|
| | | 3M |
| | A tile level programming language to generate high performance code. | 475K |
| | Fast and memory-efficient exact attention | 406K |
| | Fast and memory-efficient exact attention | 25K |
| | TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... | 16K |
| | CUDA kernel library for Kestrel | 13K |
| | a fast, efficient inference engine for moondream | 11K |
| | Fast and memory-efficient exact attention | 4K |
| | Fast and memory-efficient exact attention | 357 |
| | NVIDIA SOL ExecBench - GPU kernel evaluation framework | 294 |
| | Diligent framework for python | 293 |
| | CUDA kernel library for Kestrel (Jetson PT25 backend) | 64 |
| | CUDA kernel library for Kestrel (Jetson PT24 backend) | 55 |