Dependents of torchaudio

1,099 dependents

Package	Description	Downloads/month
sglang	SGLang is a high-performance serving framework for large language models and mul...	298.9M
vllm	A high-throughput and memory-efficient inference and serving engine for LLMs	9.2M
pyannote-audio	State-of-the-art speaker diarization toolkit	2.2M
torch-audiomentations	Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for...	1.7M
torch-pitch-shift	Pitch-shift audio clips quickly with PyTorch (CUDA supported)! Additional utilit...	1.7M
speechbrain	All-in-one speech toolkit in pure Python and PyTorch	1.6M
whisperx	WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarizatio...	1.1M
silero-vad	Silero VAD: pre-trained enterprise-grade Voice Activity Detector	857K
openunmix	Open-Unmix - Music Source Separation for PyTorch	496K
mattersim	MatterSim: A deep learning atomistic model across elements, temperatures and pre...	423K
s3tokenizer	Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) propos...	292K
torchcrepe	PyTorch implementation of the CREPE pitch tracker	265K
realtimestt	A robust, efficient, low-latency speech-to-text library with advanced voice acti...	218K
qwen-tts	Qwen-TTS python package	212K
tts	🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p...	204K
resemble-perth	Open Audio Watermarking Tool	158K
torchfcpe	The official PyTorch implementation of Fast Context-based Pitch Estimation (FCPE...	139K
voxcpm	VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice D...	125K
audiocraft	Audiocraft is a library for audio processing and generation with deep learning. ...	122K
stable-audio-tools	Generative models for conditional audio generation	117K
f5-tts	Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech wi...	106K
omnivoice	High-Quality Voice Cloning TTS for 600+ Languages	105K
chatterbox-tts	SoTA open-source TTS	98K
braindecode	Deep learning software to decode EEG, ECG or MEG signals	66K
stable-ts	Transcription, forced alignment, and audio indexing with OpenAI's Whisper	54K
gamesentenceminer	An immersion toolkit for learning Languages through games and other visual media...	49K
rvc-python	Using RVC via console or python scripts	38K
audiobox-aesthetics	Unified automatic quality assessment for speech, music, and sound.	37K
optimum-rbln	⚡ A seamless integration of HuggingFace Transformers & Diffusers with RBLN SDK f...	30K
vllm-cpu	Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe...	30K
llamafactory	Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)	29K
penn	Pitch Estimating Neural Networks (PENN)	26K
endoreg-db	endoreg-db	25K
neucodec	A package for NeuCodec, based on xcodec2.	24K
whisper-live	A nearly-live implementation of OpenAI's Whisper.	20K
xtts-api-server	A simple FastAPI Server to run XTTSv2	20K
naeural-core	These modules form the backbone "bare-metal" version of the Naeural Edge Protoco...	20K
uniception	Generalizable Perception Stack for all things 3D, 4D & Scene Understanding	18K
wavmark	AI-based Audio Watermarking Tool	18K
audiolm-pytorch	Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation...	18K
msclap	CLAP (Contrastive Language-Audio Pretraining) is a model that learns acoustic co...	17K
airunner	Offline inference engine for art, real-time voice conversations, LLM powered cha...	14K
beat-this	Accurate and general beat tracker	14K
senselab	senselab is a Python package that simplifies building pipelines for biometric (e...	13K
protenix	Toward High-Accuracy Open-Source Biomolecular Structure Prediction.	13K
caul	Python implementation of an ASR service	12K
hume-tada	Text-Acoustic Dual-Aligned Language Model	12K
so-vits-svc-fork	so-vits-svc fork with realtime support, improved interface and more features.	11K
aps-ai-beamline-controller	AI-driven Beamline Controller	10K
batchalign	Python Speech Language Sample Analysis	10K