1,220 dependents
Package Description Downloads/month
SGLang is a high-performance serving framework for large language models and mul... 287.7M
Python library for audio and music analysis 9.7M
All-in-one speech toolkit in pure Python and PyTorch 1.6M
Open WebUI 1.3M
aider is AI pair programming in your terminal 864K
Data preparation for speech processing models training. 688K
A framework for efficient model inference with omni-modality models 477K
GenAI Perf Analyzer CLI - CLI tool to simplify profiling LLMs and Generative AI ... 366K
Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python p... 364K
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrai... 357K
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p... 292K
AIPerf is a package for performance testing of AI models 285K
Native Python WFDB package 212K
Qwen-TTS python package 209K
coqui-ai tts 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p... 202K
A robust, efficient, low-latency speech-to-text library with advanced voice acti... 177K
Contrastive Language-Audio Pretraining 175K
Open Audio Watermarking Tool 157K
🐸 - A general purpose model trainer, as flexible as it gets 135K
Qwen-ASR python package 127K
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice D... 122K
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech wi... 104K
Real-time avatar engine — 100+ FPS on CPU. Generate lip-synced video, stream liv... 104K
High-Quality Voice Cloning TTS for 600+ Languages 103K
Sprite AI is an AI companion for your desktop 83K
Handling audio files in Python 72K
A streaming audio reader, processor, and writer built on top of soundfile, and P... 66K
Dreadnode Strikes SDK 59K
56K
An immersion toolkit for learning Languages through games and other visual media... 50K
python wrapper for rubberband 45K
Using RVC via console or python scripts 39K
vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers M... 36K
End-to-End Speech Processing Toolkit 30K
Not1MM != N1MM, An amateur radio contest logger for Linux. 25K
endoreg-db 25K
For running psychology and neuroscience experiments 22K
AgentMake AI: a kit for developing agentic AI applications that support 24 AI ba... 22K
A nearly-live implementation of OpenAI's Whisper. 20K
Pluto ML - Machine Learning Operations Framework 18K
Text-to-Speech with Multiple Backend Fallback (elevenlabs → luxtts → gtts → pytt... 17K
Minimax MCP Server 17K
RESTAI, so many 'A's and 'I's, so little time... 17K
Chat application with multi-agents system supports multi-models and MCP 16K
ElevenLabs MCP Server 16K
TensorRT LLM provides users with an easy-to-use Python API to define Large Langu... 16K
Real-time text-to-speech with Qwen3-TTS 15K
Mistral Voxtral STT/TTS adapter for Vox 15K
A runtime library for OctoAI. 15K
OmniVAD — Cross-platform Voice Activity Detection and Audio Event Detection (bas... 14K