275 dependents
| Description | Downloads/month | |
|---|---|---|
| A framework for efficient model inference with omni-modality models | 477K | |
| Your AI second brain. Self-hostable. Get answers from the web or your docs. Buil... | 61K | |
| Transcription, forced alignment, and audio indexing with OpenAI's Whisper | 55K | |
| Multilingual Automatic Speech Recognition with word-level timestamps and confide... | 49K | |
| A batteries-included bridge between your abstract_* ecosystem and popular Huggin... | 38K | |
| A nearly-live implementation of OpenAI's Whisper. | 20K | |
| Generate lyric translations and transcriptions from Spotify URLs using OpenAI's ... | 18K | |
| Manipulate data with code that is less a golden retriever, and more a Samurai's ... | 15K | |
| Python Speech Language Sample Analysis | 10K | |
| Python Speech Language Sample Analysis | 8K | |
| Interface for OuteTTS models. | 7K | |
| | 7K | |
| FreeGenius AI, an advanced AI assistant that is capable of engaging in conversat... | 7K | |
| ✅A Lightweight Video RAG Framework for Multimodal Reasoning | 5K | |
| GailBot API | 5K | |
| | 3K | |
| Speechlib is a library that unifies speaker diarization, transcription and speak... | 3K | |
| General Utilities | 3K | |
| A stream-translator fork with VAD based audio slicing & GPT / Gemini translation... | 3K | |
| Vox box | 3K | |
| Evidence-first social-media research CLI + Claude Code skill | 3K | |
| An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo ... | 2K | |
| Confidential AI deployment with secure enclaves :lock: | 2K | |
| Automatically transcribe WhatsApp voice notes to your clipboard using OpenAI Whi... | 2K | |
| 🎥 Turn one long video into 10 viral clips – 10x faster! 🚀 Make your content shar... | 2K | |
| | 2K | |
| Extract frames and transcripts from video files for LLM context and multimodal p... | 2K | |
| Speech to Text (s2t): Record audio, run Whisper, export formats, and copy transc... | 2K | |
| Integrating Arclight with Digital Content, IIIF, and ArchivesSpace | 2K | |
| Watch a source folder and automatically transcode audio files to multiple format... | 2K | |
| Comprehensive STT and TTS Voice Engine for ONYX platform | 2K | |
| A simple tool to transcribe audio files | 2K | |
| A simple GUI to process a small number of audio files using OpenAI's Whisper mod... | 1K | |
| Whisper for your microphone | 1K | |
| minimalist ai agent | 1K | |
| Enter text using your voice. | 1K | |
| Preprocess audio data | 1K | |
| VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Spe... | 1K | |
| Transcribe and/or translate all sound files in a folder using Whisper | 1K | |
| Artemis CLI - AI-powered research and development tool | 1K | |
| Push-to-talk voice conversations powered by Whisper and Claude Code | 1K | |
| HomeSetup - AskAI | 1K | |
| An easy-to-use library and command-line tool for TTS | 1K | |
| A simple tool to make the video, audio, subtitle and video-url (especially youtu... | 1K | |
| Handy tools | 999 | |
| Add language model support to HF Transformers' Whisper models | 980 | |
| Data preparation system to build controllable AI system | 929 | |
| Cued Speech Processing Tools - Decode and Generate cued speech videos | 924 | |
| A tool module to help you do marketing | 907 | |
| WhisperFlow: Real-Time Transcription Powered by OpenAI Whisper | 878 |