275 dependents
Package Description Downloads/month
A framework for efficient model inference with omni-modality models 477K
Your AI second brain. Self-hostable. Get answers from the web or your docs. Buil... 61K
Transcription, forced alignment, and audio indexing with OpenAI's Whisper 55K
Multilingual Automatic Speech Recognition with word-level timestamps and confide... 49K
A batteries-included bridge between your abstract_* ecosystem and popular Huggin... 38K
A nearly-live implementation of OpenAI's Whisper. 20K
Generate lyric translations and transcriptions from Spotify URLs using OpenAI's ... 18K
Manipulate data with code that is less a golden retriever, and more a Samurai's ... 15K
Python Speech Language Sample Analysis 10K
Python Speech Language Sample Analysis 8K
Interface for OuteTTS models. 7K
7K
FreeGenius AI, an advanced AI assistant that is capable of engaging in conversat... 7K
✅A Lightweight Video RAG Framework for Multimodal Reasoning 5K
GailBot API 5K
3K
Speechlib is a library that unifies speaker diarization, transcription and speak... 3K
General Utilities 3K
A stream-translator fork with VAD based audio slicing & GPT / Gemini translation... 3K
Vox box 3K
Evidence-first social-media research CLI + Claude Code skill 3K
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo ... 2K
Confidential AI deployment with secure enclaves 🔒 2K
Automatically transcribe WhatsApp voice notes to your clipboard using OpenAI Whi... 2K
🎥 Turn one long video into 10 viral clips – 10x faster! 🚀 Make your content shar... 2K
2K
Extract frames and transcripts from video files for LLM context and multimodal p... 2K
s2t
Speech to Text (s2t): Record audio, run Whisper, export formats, and copy transc... 2K
Integrating Arclight with Digital Content, IIIF, and ArchivesSpace 2K
Watch a source folder and automatically transcode audio files to multiple format... 2K
Comprehensive STT and TTS Voice Engine for ONYX platform 2K
A simple tool to transcribe audio files 2K
A simple GUI to process a small number of audio files using OpenAI's Whisper mod... 1K
Whisper for your microphone 1K
minimalist ai agent 1K
Enter text using your voice. 1K
Preprocess audio data 1K
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Spe... 1K
Transcribe and/or translate all sound files in a folder using Whisper 1K
Artemis CLI - AI-powered research and development tool 1K
Push-to-talk voice conversations powered by Whisper and Claude Code 1K
HomeSetup - AskAI 1K
An easy-to-use library and command-line tool for TTS 1K
A simple tool to make the video, audio, subtitle and video-url (especially youtu... 1K
Handy tools 999
Add language model support to HF Transformers' Whisper models 980
Data preparation system to build controllable AI system 929
Cued Speech Processing Tools - Decode and Generate cued speech videos 924
A tool module to help you do marketing 907
WhisperFlow: Real-Time Transcription Powered by OpenAI Whisper 878