58 dependents
| Description | Downloads/month | |
|---|---|---|
| Python library for audio and music analysis | 9.7M | |
| Data preparation for speech processing models training. | 688K | |
| Python bindings for Chromaprint acoustic fingerprinting and the Acoustid Web ser... | 70K | |
| A Python API for SuperCollider | 34K | |
| Clarity Challenge toolkit - software for building Clarity Challenge systems | 3K | |
| Use bacpipe to streamline the process of generating embeddings and analysing you... | 3K | |
| Vox box | 3K | |
| Helpful tools to assist in making Synth Riders maps | 3K | |
| A Creative Computing Python Library for Interactive Audio Generation and Audio R... | 2K | |
| Tatt creates a uniform API for multiple speech-to-text (STT) services. | 2K | |
| Transcribe and translate voice into LRC file using Whisper and LLMs (GPT, Claude... | 1K | |
| Creating and managing playlists, and managing the filenames and directory struct... | 1K | |
| Illuminat: Revolutionizing Education through Personalization | 1K | |
| Eco-acoustics data visualization and analysis | 1K | |
| OCEAN-AI | 830 | |
| Python text-to-speech library with built-in voice effects and support for multip... | 738 | |
| A simple library for Fréchet Audio Distance (FAD) calculation | 733 | |
| Loudness added. | 658 | |
| A Python forced alignment package | 641 | |
| Audio player for studying English. | 593 | |
| This is a library consisting of pre-trained models for the synthesis of Russian ... | 532 | |
| BackdoorMBTI is an open source project expanding the unimodal backdoor learning ... | 531 | |
| TTS with RVC pipeline | 485 | |
| A sound-controlled terminal-based rhythm game. | 474 | |
| Utilities | 466 | |
| PDF processing pipeline: remove headers/footers, convert to markdown, and genera... | 410 | |
| A lightweight, offline-first transcription utility for audio and video files wit... | 404 | |
| Music Audio Feature Extractor | 393 | |
| Shandong Unicom Industrial Internet AI toolbox | 390 | |
| Powerful, simple, audio tag editor for GNU/Linux | 358 | |
| FFTrack is a Python-based music recognition tool that allows users to identify s... | 341 | |
| Library for detecting emotions in images and audio using Robobo | 321 | |
| A Python tool and Beets plugin for music emotion and perceptual feature predicti... | 305 | |
| Production-ready transcription and diarization pipeline with parallel processing | 286 | |
| A standalone service for transcribing audio files using WhisperX | 224 | |
| German Text-To-Speech Engine using Tacotron and Griffin-Lim | 204 | |
| A Python-based tool designed for analyzing and adjusting the tempo of music trac... | 188 | |
| ASRDeepspeech x Sakura-ML (English/Japanese) with deepspeech2 model in pytorch ... | 181 | |
| Provides training, inference and voice conversion recipes for RADTTS and RADTTS+... | 176 | |
| | 166 | |
| | 162 | |
| Framework for training deep automatic speech recognition models. | 159 | |
| The package contains a ready-to-use streamlit widget for downloading or recordin... | 155 | |
| A Python program to automatically detect fish sounds in passive acoustic recordi... | 134 | |
| Tools for 3D camera calibration and reconstruction with graphical user interface... | 126 | |
| VoiceStudio: A unified toolkit for text-style prompted speech synthesis, voice a... | 119 | |
| A multi-processing audio check | 104 | |
| Standalone ASR Flask service package for XDP | 100 | |
| | 88 | |
| Production-ready UVR5 CLI & Docker image. Run SOTA separation models (Roformer, ... | 85 |