59 dependents
| Package | Description | Downloads/month |
|---|---|---|
| | py-webrtcvad wrapper for trimming speech clips | 28K |
| | Automagically synchronize subtitles with video. | 13K |
| | GenAI Processors is a lightweight Python library that enables efficient, paralle... | 5K |
| | ASR text preprocessing utility | 4K |
| | Machine Learning Utilities | 4K |
| | Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Stream... | 3K |
| | Core Python utilities for the Snips Manager | 3K |
| | A modular Python library for voice interactions with AI systems | 2K |
| | ovos plugin for voice activity detection using webrtcvad | 2K |
| | A Python package with a built-in web application | 1K |
| | minimalist ai agent | 1K |
| | Transcribe long audio files with ASR or use the streaming interface | 1K |
| | Generate TTS audio samples for training wake word systems | 899 |
| | A collection of basic python modules for spoken natural language processing | 774 |
| | A Python package for recording, transcribing, and converting audio | 732 |
| | A real-time audio transcription and AI interaction tool | 616 |
| | Automatic Speech Analysis for Cognitive Assessment | 599 |
| | Speech-to-Text tool using Whisper, PyAudio, and VAD. | 511 |
| | Core Python utilities for the Snips Manager | 482 |
| | LightWhisperSTT – Fast, lightweight STT using Whisper.cpp | 472 |
| | A modular Python library for voice interactions with AI systems, featuring high-... | 427 |
| | PDF processing pipeline: remove headers/footers, convert to markdown, and genera... | 410 |
| | FindSub is an Application for automatically downloading and ranking subtitles ba... | 348 |
| | A Speech-to-Text toolkit with VAD, punctuation, and emotion classification | 343 |
| | simple to use, pretrained/training-less models for speaker diarization | 319 |
| | Real-time speech-to-text transcription optimized for Apple Silicon | 302 |
| | A Python package with a built-in web application | 300 |
| | An open source implementation of the audio server part of the Hermes protocol | 273 |
| | Voice-powered research assistant for physical books and papers | 271 |
| | a tool for multimedia | 242 |
| | A simple mic utility, for streaming with vad. You can use it, but it's not recomm... | 230 |
| | Remote access to your PC microphone & voice detection with Web UI | 221 |
| | | 199 |
| | Indic Conformer ASR Lib | 195 |
| | An open-source dataset for multiple purposes, such as speaker localization/track... | 185 |
| | Provides training, inference and voice conversion recipes for RADTTS and RADTTS+... | 176 |
| | Tools for speech processing, keyword spotting | 170 |
| | Multimodal emotion recognition framework for video analysis | 170 |
| | A library for in-vehicle speaker identification and noise correction based on Whisper and ECAPA-TDNN | 147 |
| | To build voice enabled objects/applications with Python and ReSpeaker | 147 |
| | Hotkey-Activated Voice-to-Clipboard Transcriber | 139 |
| | Tactigon Gear SDK to connect to Tactigon Skin wearable platform | 136 |
| | A modular application for audio processing and Finch robot control. | 135 |
| | A pretrained IPA recognizer | 131 |
| | Agent Framework plugin for services using IndusLabs API. | 115 |
| | Dictation for programmers | 105 |
| | Advanced real-time voice processing library using Whisper and Silero models | 105 |
| | A package for verifying the voice of a person | 103 |
| | DYadic Annotation of Naturalistic Audio | 80 |
| | Samvaad is a speech-driven AI tutor that transforms PDFs, articles, and notes in... | 80 |