59 dependents
Package Description Downloads/month
py-webrtcvad wrapper for trimming speech clips 28K
Automagically synchronize subtitles with video. 13K
GenAI Processors is a lightweight Python library that enables efficient, paralle... 5K
ASR text preprocessing utility 4K
dnn Machine Learning Utilities 4K
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Stream... 3K
Core Python utilities for the Snips Manager 3K
A modular Python library for voice interactions with AI systems 2K
ovos plugin for voice activity detection using webrtcvad 2K
A Python package with a built-in web application 1K
minimalist ai agent 1K
Transcribe long audio files with ASR or use the streaming interface 1K
Generate TTS audio samples for training wake word systems 899
A collection of basic python modules for spoken natural language processing 774
A Python package for recording, transcribing, and converting audio 732
A real-time audio transcription and AI interaction tool 616
Automatic Speech Analysis for Cognitive Assessment 599
Speech-to-Text tool using Whisper, PyAudio, and VAD. 511
Core Python utilities for the Snips Manager 482
LightWhisperSTT – Fast, lightweight STT using Whisper.cpp 472
A modular Python library for voice interactions with AI systems, featuring high-... 427
PDF processing pipeline: remove headers/footers, convert to markdown, and genera... 410
FindSub is an Application for automatically downloading and ranking subtitles ba... 348
A Speech-to-Text toolkit with VAD, punctuation, and emotion classification 343
simple to use, pretrained/training-less models for speaker diarization 319
Real-time speech-to-text transcription optimized for Apple Silicon 302
A Python package with a built-in web application 300
An open source implementation of the audio server part of the Hermes protocol 273
Voice-powered research assistant for physical books and papers 271
a tool for multimedia 242
A simple mic utility, for streaming with vad. You can use it, but it's not recomm... 230
Remote access to your PC microphone & voice detection with Web UI 221
199
Indic Conformer ASR Lib 195
An open-source dataset for multiple purposes, such as speaker localization/track... 185
Provides training, inference and voice conversion recipes for RADTTS and RADTTS+... 176
Tools for speech processing, keyword spotting 170
Multimodal emotion recognition framework for video analysis 170
In-vehicle speaker identification and noise correction library based on Whisper and ECAPA-TDNN 147
To build voice enabled objects/applications with Python and ReSpeaker 147
Hotkey-Activated Voice-to-Clipboard Transcriber 139
Tactigon Gear SDK to connect to Tactigon Skin wearable platform 136
A modular application for audio processing and Finch robot control. 135
A pretrained IPA recognizer 131
Agent Framework plugin for services using IndusLabs API. 115
Dictation for programmers 105
Advanced real-time voice processing library using Whisper and Silero models 105
A package for verifying the voice of a person 103
DYadic Annotation of Naturalistic Audio 80
Samvaad is a speech-driven AI tutor that transforms PDFs, articles, and notes in... 80