Dependents of librosa

984 dependents

Package	Description	Downloads/month
qwen-omni-utils	Qwen3-VL is the multimodal large language model series developed by Qwen team, A...	528K
audio-separator	Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python p...	364K
funasr	A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrai...	357K
coqui-tts	🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p...	292K
torchcrepe	Pytorch implementation of the CREPE pitch tracker	273K
qwen-tts	Qwen-TTS python package	209K
tts	🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p...	202K
audiomentations	A Python library for audio data augmentation. Useful for making audio ML models ...	178K
laion-clap	Contrastive Language-Audio Pretraining	175K
resemble-perth	Open Audio Watermarking Tool	157K
qwen-asr	Qwen-ASR python package	127K
voxcpm	VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice D...	122K
audiocraft	Audiocraft is a library for audio processing and generation with deep learning. ...	122K
f5-tts	Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech wi...	104K
omnivoice	High-Quality Voice Cloning TTS for 600+ Languages	103K
chatterbox-tts	SoTA open-source TTS	98K
sagemaker-huggingface-inference-toolkit	Open source library for running inference workload with Hugging Face Deep Learni...	89K
basic-pitch	Basic Pitch, a lightweight yet powerful audio-to-MIDI converter with pitch bend ...	63K
bigvgan		56K
fastrtc	The python library for real-time communication	50K
tflite-model-maker-nightly	TensorFlow examples	46K
nv-ingest	NeMo Retriever Library is a scalable, performance-oriented document content and ...	31K
espnet	End-to-End Speech Processing Toolkit	30K
pyvad	py-webrtcvad wrapper for trimming speech clips	28K
warpq	This code is to run the WARP-Q speech quality metric.	26K
endoreg-db	endoreg-db	25K
mlx-openai-server	A high-performance API server that provides OpenAI-compatible endpoints for MLX ...	22K
whisper-live	A nearly-live implementation of OpenAI's Whisper.	20K
note-seq	Use machine learning to create art and music	20K
wavmark	AI-based Audio Watermarking Tool	18K
osc-data	data represent, processing	17K
msclap	CLAP (Contrastive Language-Audio Pretraining) is a model that learns acoustic co...	17K
montreal-forced-aligner	Command line utility for forced alignment using Kaldi	15K
neutts	NeuTTS - a package for text-to-speech generation using Neuphonic's TTS models.	14K
tflite-model-maker	TensorFlow examples	13K
poluvr	Easy to use audio stem separation with a UI, using various models from UVR train...	12K
caul	Python implementation of an ASR service	12K
parakeet-mlx	An implementation of Nvidia's Parakeet models for Apple Silicon using MLX.	11K
so-vits-svc-fork	so-vits-svc fork with realtime support, improved interface and more features.	11K
phastc	Phenomenological Adaptive STochastic auditory nerve fiber model	10K
mcp-music-analysis	Integrate librosa, whisper with LLMs to analyze music audio.	10K
python-audio-autotest-3-10	This is an auto-testing framework of audio functions for Android devices.	10K
vieneu	Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference ...	10K
simo	Django Smart Home	9K
libfmp	libfmp - Python package for teaching and learning Fundamentals of Music Processi...	8K
clearvoice	An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, ...	8K
torch-stft	An STFT/iSTFT for PyTorch.	8K
polflashsr	FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation / ...	8K
beatnet	BeatNet is state-of-the-art (Real-Time) and Offline joint music beat, downbeat, ...	7K
sonusai	Framework for building deep neural network models for sound, speech, and voice A...	7K