1,220 dependents
| Description | Downloads/month |
|---|---|
| SGLang is a high-performance serving framework for large language models and mul... | 287.7M |
| Python library for audio and music analysis | 9.7M |
| All-in-one speech toolkit in pure Python and Pytorch | 1.6M |
| Open WebUI | 1.3M |
| aider is AI pair programming in your terminal | 864K |
| Data preparation for speech processing models training. | 688K |
| A framework for efficient model inference with omni-modality models | 477K |
| GenAI Perf Analyzer CLI - CLI tool to simplify profiling LLMs and Generative AI ... | 366K |
| Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python p... | 364K |
| A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrai... | 357K |
| 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p... | 292K |
| AIPerf is a package for performance testing of AI models | 285K |
| Native Python WFDB package | 212K |
| Qwen-TTS python package | 209K |
| 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p... | 202K |
| A robust, efficient, low-latency speech-to-text library with advanced voice acti... | 177K |
| Contrastive Language-Audio Pretraining | 175K |
| Open Audio Watermarking Tool | 157K |
| 🐸 - A general purpose model trainer, as flexible as it gets | 135K |
| Qwen-ASR python package | 127K |
| VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice D... | 122K |
| Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech wi... | 104K |
| Real-time avatar engine — 100+ FPS on CPU. Generate lip-synced video, stream liv... | 104K |
| High-Quality Voice Cloning TTS for 600+ Languages | 103K |
| Sprite AI is an AI companion for your desktop | 83K |
| Handling audio files in Python | 72K |
| A streaming audio reader, processor, and writer built on top of soundfile, and P... | 66K |
| Dreadnode Strikes SDK | 59K |
| | 56K |
| An immersion toolkit for learning Languages through games and other visual media... | 50K |
| python wrapper for rubberband | 45K |
| Using RVC via console or python scripts | 39K |
| vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers M... | 36K |
| End-to-End Speech Processing Toolkit | 30K |
| Not1MM != N1MM, An amateur radio contest logger for Linux. | 25K |
| endoreg-db | 25K |
| For running psychology and neuroscience experiments | 22K |
| AgentMake AI: a kit for developing agentic AI applications that support 24 AI ba... | 22K |
| A nearly-live implementation of OpenAI's Whisper. | 20K |
| Pluto ML - Machine Learning Operations Framework | 18K |
| Text-to-Speech with Multiple Backend Fallback (elevenlabs → luxtts → gtts → pytt... | 17K |
| Minimax MCP Server | 17K |
| RESTAI, so many 'A's and 'I's, so little time... | 17K |
| Chat application with multi-agents system supports multi-models and MCP | 16K |
| ElevenLabs MCP Server | 16K |
| TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... | 16K |
| Real-time text-to-speech with Qwen3-TTS | 15K |
| Mistral Voxtral STT/TTS adapter for Vox | 15K |
| A runtime library for OctoAI. | 15K |
| OmniVAD — Cross-platform Voice Activity Detection and Audio Event Detection (bas... | 14K |