984 dependents
Package Description Downloads/month
Qwen3-VL is the multimodal large language model series developed by the Qwen team, A... 528K
Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a Python p... 364K
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrai... 357K
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p... 292K
PyTorch implementation of the CREPE pitch tracker 273K
Qwen-TTS Python package 209K
coqui-ai tts
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p... 202K
A Python library for audio data augmentation. Useful for making audio ML models ... 178K
Contrastive Language-Audio Pretraining 175K
Open Audio Watermarking Tool 157K
Qwen-ASR Python package 127K
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice D... 122K
Audiocraft is a library for audio processing and generation with deep learning. ... 122K
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech wi... 104K
High-Quality Voice Cloning TTS for 600+ Languages 103K
SoTA open-source TTS 98K
Open source library for running inference workload with Hugging Face Deep Learni... 89K
Basic Pitch, a lightweight yet powerful audio-to-MIDI converter with pitch bend ... 63K
56K
The Python library for real-time communication 50K
TensorFlow examples 46K
NeMo Retriever Library is a scalable, performance-oriented document content and ... 31K
End-to-End Speech Processing Toolkit 30K
py-webrtcvad wrapper for trimming speech clips 28K
Code for running the WARP-Q speech quality metric. 26K
endoreg-db 25K
A high-performance API server that provides OpenAI-compatible endpoints for MLX ... 22K
A nearly-live implementation of OpenAI's Whisper. 20K
Use machine learning to create art and music 20K
AI-based Audio Watermarking Tool 18K
Data representation and processing 17K
CLAP (Contrastive Language-Audio Pretraining) is a model that learns acoustic co... 17K
Command line utility for forced alignment using Kaldi 15K
NeuTTS - a package for text-to-speech generation using Neuphonic's TTS models. 14K
TensorFlow examples 13K
Easy to use audio stem separation with a UI, using various models from UVR train... 12K
Python implementation of an ASR service 12K
An implementation of Nvidia's Parakeet models for Apple Silicon using MLX. 11K
so-vits-svc fork with realtime support, improved interface and more features. 11K
Phenomenological Adaptive STochastic auditory nerve fiber model 10K
Integrate librosa, whisper with LLMs to analyze music audio. 10K
This is an auto-testing framework of audio functions for Android devices. 10K
Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference ... 10K
Django Smart Home 9K
libfmp - Python package for teaching and learning Fundamentals of Music Processi... 8K
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, ... 8K
An STFT/iSTFT for PyTorch. 8K
FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation / ... 8K
BeatNet is state-of-the-art (Real-Time) and Offline joint music beat, downbeat, ... 7K
Framework for building deep neural network models for sound, speech, and voice A... 7K