21 dependents
| Package | Description | Downloads/month |
|---|---|---|
| | All-in-one speech toolkit in pure Python and Pytorch | 1.6M |
| | Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Stream... | 3K |
| | Vox box | 3K |
| | A unified interface to extract hidden representations from speech foundation mod... | 1K |
| | Google EMEA gTech Ads Data Science Team's solution to automatically translate an... | 1K |
| | A python forced alignment package | 641 |
| | Minimal CosyVoice2 European inference CLI (bundles runtime + Matcha) | 467 |
| | Token2Audio Server Package | 403 |
| | Production-ready transcription and diarization pipeline with parallel processing | 286 |
| | Minimal CosyVoice2 French inference CLI (bundles runtime + Matcha) | 237 |
| | A standalone service for transcribing audio files using WhisperX | 224 |
| | cosyvoice package by xp | 144 |
| | StepAudio2: Audio tokenizer and TTS model | 143 |
| | Step-Audio 2 is an end-to-end multi-modal large language model designed for indu... | 109 |
| | VietTTS: An Open-Source Vietnamese Text to Speech | 108 |
| | Spoof-Aware Speaker Verification System | 103 |
| | | 79 |
| | Multi-lingual large voice generation model, providing inference, training and de... | 65 |
| | Step-Audio 2 is an end-to-end multi-modal large language model designed for indu... | 64 |
| | A fundamental toolkit designed for music, song, and audio generation | 61 |
| | Multi-lingual large voice generation model, providing inference, training and de... | 34 |