82 dependents
Package Description Downloads/month
Python library for audio and music analysis 9.7M
Open Source framework for voice and multimodal conversational AI 677K
A Python library for audio data augmentation. Useful for making audio ML models ... 178K
Real-time avatar engine — 100+ FPS on CPU. Generate lip-synced video, stream liv... 104K
Open Source framework for voice and multimodal conversational AI 11K
Accurate and general beat tracker 11K
Ubo main app, running on device initialization. A platform for running other app... 8K
Universal local runtime for STT and TTS models 6K
FASR: Fast Automatic Speech Recognition Pipeline 4K
Ultimate RVC 4K
Software Decoder for raw rf captures of laserdisc, vhs and other analog video fo... 4K
Clarity Challenge toolkit - software for building Clarity Challenge systems 3K
A simple yet effective Audio-to-Midi Automatic Piano Transcription system 3K
Visualize and maintain datasets to develop and understand data-driven algorithms... 3K
Use bacpipe to streamline the process of generating embeddings and analysing you... 3K
Global hotkeys to record speech and transcribe directly to your cursor 2K
A Creative Computing Python Library for Interactive Audio Generation and Audio R... 2K
A framework for computer music in Python 2K
The Phoneme Discovery Benchmark 2K
DeepFense: A Unified, Modular, and Extensible Framework for Robust Deepfake Audi... 1K
A unified interface to extract hidden representations from speech foundation mod... 1K
Illuminat: Revolutionizing Education through Personalization 1K
Build 📞 Telephonic-Grade Voice AI — 🌐 WebRTC-Ready Framework 1K
GPT-SoVITS ONNX Inference Engine & Model Converter 1K
An easy-to-use library and command-line tool for TTS 1K
Easy Audio Interfaces is a Python library that provides a simple and flexible wa... 1K
Music Modeling Kit 966
OSEkit 889
Tiny macOS dictation tool on your menubar 762
A Python forced alignment package 641
VoxCPM TTS model with Apple Neural Engine backend server 641
Metrics to measure the quality of audio 600
This is a library consisting of pre-trained models for the synthesis of Russian ... 532
BackdoorMBTI is an open source project expanding the unimodal backdoor learning ... 531
Score UltraStar karaoke files against vocal audio using Vocaluxe pitch detection... 484
Real-time audio transcription MCP server for Claude Code 483
Unified, inference-only toolkit for MT3 model family (Magenta MT3, MR-MT3, MT3-P... 481
Python Wrapper of Silero VAD 476
Voxtral audio processing and model implementation for Apple Silicon using MLX 438
GPT-SoVITS ONNX Inference Engine & Model Converter 426
Live music performance playback 416
PDF processing pipeline: remove headers/footers, convert to markdown, and genera... 410
Shandong Unicom Industrial Internet AI Toolbox 390
Effective evaluations for Text-to-Speech (TTS) systems 390
370
TensorFlow generation of SoX-style spectrograms on the GPU 359
FFTrack is a Python-based music recognition tool that allows users to identify s... 341
Library for detecting emotions in images and audio using Robobo 321
Provides high-quality Chinese speech synthesis and voice cloning services based on models such as SparkTTS and OrpheusTTS. 303
A Python library for applying information theory and AI/ML models to animal comm... 303