45 dependents
| Description | Downloads/month | |
|---|---|---|
| Open source library for running inference workload with Hugging Face Deep Learni... | 89K | |
| NeuTTS - a package for text-to-speech generation using Neuphonic's TTS models. | 14K | |
| 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching | 5K | |
| Prosodic: a metrical-phonological parser, written in Python. For English and Fin... | 4K | |
| Universal Deep Learning Inference Engine — execute any AI model without model-sp... | 3K | |
| Implementation of KittenTTS | 3K | |
| Extract phoneme-level timestamps from speech audio. | 2K | |
| Vision Unlearning: a tool for Machine Unlearning in Computer Vision | 1K | |
| Composable lyric-audio alignment pipeline with staged execution, batch processin... | 717 | |
| mimic tts plugin for OpenVoiceOS | 683 | |
| A multilingual phoneme recognizer capable of generalizing zero-shot to unseen ph... | 656 | |
| This package is written for text-to-audio/music generation. | 648 | |
| X-Voice | 571 | |
| Minimal CosyVoice2 European inference CLI (bundles runtime + Matcha) | 467 | |
| minimal deep learning framework | 420 | |
| Python forced alignment | 383 | |
| Generator and evaluator for speech corpora | 376 | |
| Text-to-speech | 359 | |
| The TTSDS benchmark evaluates synthetic speech quality by considering prosody, s... | 346 | |
| MaskGCT convenient inference wrapper | 317 | |
| Local audiobook generation system using MLX-Audio for Apple Silicon | 308 | |
| VST - Voice Simple Tools | 284 | |
| Text-to-Speech module for Illufly AI | 248 | |
| Minimal CosyVoice2 French inference CLI (bundles runtime + Matcha) | 237 | |
| Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inv... | 230 | |
| VITS toolkit on Pytorch | 222 | |
| Fork of StyleTTS 2 Python package. StyleTTS 2: Towards Human-Level Text-to-Speech... | 222 | |
| Multilingual phonetic-similarity replacement engine — a proper-noun substitution... | 218 | |
| Better Lyrics Translation Toolkit | 205 | |
| MaskGCT model for TTSDB | 189 | |
| A text to speech plugin based on StyleTTS2. | 146 | |
| A text normalization package for TTS preprocessing with multi-language support | 135 | |
| Hey Buddy is a tool for training wake-word-detecting neural networks for use in ... | 110 | |
| A python package to evaluate pronunciation difficulty of words in multiple langu... | 109 | |
| A serving system for speech language models. | 109 | |
| Spoken language detection framework | 103 | |
| Chiluka - A lightweight TTS inference package based on StyleTTS2 | 99 | |
| VITS toolkit on PaddlePaddle | 94 | |
| TTS with The Massively Multilingual Speech (MMS) project | 91 | |
| Add a short description here | 85 | |
| NeuTTS Air plugin for VOCO audio inference runtime | 65 | |
| Forked from DiffRhythm2: Text-to-Music Generation with Diffusion and Music-Langu... | 62 | |
| OpenSICL package | 56 | |
| glados tts plugin for OpenVoiceOS | 35 | |
| VoiceStar: Robust, Duration-Controllable TTS that can Extrapolate | 29 |