1,099 dependents
Package Description Downloads/month
SGLang is a high-performance serving framework for large language models and mul... 298.9M
A high-throughput and memory-efficient inference and serving engine for LLMs 9.2M
State-of-the-art speaker diarization toolkit 2.2M
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for... 1.7M
Pitch-shift audio clips quickly with PyTorch (CUDA supported)! Additional utilit... 1.7M
All-in-one speech toolkit in pure Python and Pytorch 1.6M
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarizatio... 1.1M
Silero VAD: pre-trained enterprise-grade Voice Activity Detector 857K
Open-Unmix - Music Source Separation for PyTorch 496K
MatterSim: A deep learning atomistic model across elements, temperatures and pre... 423K
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) propos... 292K
Pytorch implementation of the CREPE pitch tracker 265K
A robust, efficient, low-latency speech-to-text library with advanced voice acti... 218K
Qwen-TTS python package 212K
coqui-ai tts 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and p... 204K
Open Audio Watermarking Tool 158K
The official Pytorch implementation of Fast Context-based Pitch Estimation (FCPE... 139K
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice D... 125K
Audiocraft is a library for audio processing and generation with deep learning. ... 122K
Generative models for conditional audio generation 117K
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech wi... 106K
High-Quality Voice Cloning TTS for 600+ Languages 105K
SoTA open-source TTS 98K
Deep learning software to decode EEG, ECG or MEG signals 66K
Transcription, forced alignment, and audio indexing with OpenAI's Whisper 54K
An immersion toolkit for learning Languages through games and other visual media... 49K
Using RVC via console or python scripts 38K
Unified automatic quality assessment for speech, music, and sound. 37K
⚡ A seamless integration of HuggingFace Transformers & Diffusers with RBLN SDK f... 30K
Wheels & Docker images for running vLLM on CPU-only systems, optimized for diffe... 30K
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) 29K
Pitch Estimating Neural Networks (PENN) 26K
endoreg-db 25K
A package for NeuCodec, based on xcodec2. 24K
A nearly-live implementation of OpenAI's Whisper. 20K
A simple FastAPI Server to run XTTSv2 20K
These modules form the backbone "bare-metal" version of the Naeural Edge Protoco... 20K
Generalizable Perception Stack for all things 3D, 4D & Scene Understanding 18K
AI-based Audio Watermarking Tool 18K
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation... 18K
CLAP (Contrastive Language-Audio Pretraining) is a model that learns acoustic co... 17K
Offline inference engine for art, real-time voice conversations, LLM powered cha... 14K
Accurate and general beat tracker 14K
senselab is a Python package that simplifies building pipelines for biometric (e... 13K
Toward High-Accuracy Open-Source Biomolecular Structure Prediction. 13K
Python implementation of an ASR service 12K
Text-Acoustic Dual-Aligned Language Model 12K
so-vits-svc fork with realtime support, improved interface and more features. 11K
AI-driven Beamline Controller 10K
Python Speech Language Sample Analysis 10K