110 dependents
| Description | Downloads/month | |
|---|---|---|
| Open Source framework for voice and multimodal conversational AI | 677K | |
| Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python p... | 364K | |
| Pytorch implementation of the CREPE pitch tracker | 273K | |
| Basic Pitch, a lightweight yet powerful audio-to-MIDI converter with pitch bend ... | 63K | |
| AI-based Audio Watermarking Tool | 18K | |
| Converts text to speech in realtime | 12K | |
| Easy to use audio stem separation with a UI, using various models from UVR train... | 12K | |
| Open Source framework for voice and multimodal conversational AI | 11K | |
| Turi Create simplifies the development of custom machine learning models. | 11K | |
| CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model ... | 8K | |
| Spatial Audio Python Package | 7K | |
| Fish Speech | 6K | |
| AI powered speech denoising and enhancement | 6K | |
| VideoSDK Agent Framework plugin for RNNoise. | 6K | |
| Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷 | 4K | |
| Clarity Challenge toolkit - software for building Clarity Challenge systems | 3K | |
| Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Stream... | 3K | |
| Audio processing, data structure, and package management tools. | 3K | |
| A lightweight library for Frechet Audio Distance calculation. | 3K | |
| Easy tools for RVC Inference | 3K | |
| Complete python wrapper for the elevenlabs API | 3K | |
| Speech synthesis | 2K | |
| SoTA open-source TTS | 2K | |
| Bithuman Runtime library | 2K | |
| Vogent Turn: fast, open-source turn-detection for Voice AI applications | 2K | |
| BirdNET analyzer for scientific audio data processing. | 1K | |
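| End-to-end Chinese speech recognition built on PaddlePaddle, from beginner to practice, with very simple starter examples and highly practical enterprise projects. Supports the currently most popular DeepSpeech2, Conformer... | 1K | |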
| Build 📞 Telephonic-Grade Voice AI — 🌐 WebRTC-Ready Framework | 1K | |
| collection of pitch detection algorithms with unified interface | 1K | |
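| A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition; currently supports the Conformer, Squeezeformer, and DeepSpeech2 models,... | 956 | |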
| Python library for converting EEG datasets of people with epilepsy to BIDS compa... | 937 | |
| Repo associated to the DESED dataset, download and creation of data | 862 | |
| Toolbox for generating and working with audio signals | 847 | |
| Python wrapper for inference with rvc | 769 | |
| A simple library for Fréchet Audio Distance (FAD) calculation | 733 | |
| The Pytorch implementation of sound classification supports EcapaTdnn, PANNS, TD... | 724 | |
| Simplifying audio and deep learning with PyTorch. | 686 | |
| 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT ... | 608 | |
| This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports MelSpectrogram, Spec... | 599 | |
| Automatic Speech Analysis for Cognitive Assessment | 599 | |
| Audio classification built on PaddlePaddle, supporting models such as EcapaTdnn, PANNS, TDNN, Res2Net, and ResNetSE, along with multiple preprocessing methods | 587 | |
| Voice Print Recognition toolkit on Pytorch | 583 | |
| A Python package built for health researchers, transforms raw movement data into... | 559 | |
| Audio tools for Python | 551 | |
| BackdoorMBTI is an open source project expanding the unimodal backdoor learning ... | 531 | |
| A toolbox to decode raw time-domain EEG using features. | 523 | |
| Streaming and Fine-tuning for Chatterbox TTS | 511 | |
| Library to handle signal data and perform signal processing computations | 508 | |
| TTS with RVC pipeline | 485 | |
| faunanet - A bioacoustics platform for the analysis of animal sounds with neural... | 452 |