308 dependents
Package Description Downloads/month
Open WebUI 1.3M
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarizatio... 1.1M
A robust, efficient, low-latency speech-to-text library with advanced voice acti... 177K
Sprite AI is an AI companion for your desktop 83K
An immersion toolkit for learning Languages through games and other visual media... 50K
Whisper command line client compatible with original OpenAI client based on CTra... 38K
A nearly-live implementation of OpenAI's Whisper. 20K
Useful tools for Linux life 11K
AI meeting assistant for macOS — auto-record, live transcription, Claude-powered... 11K
Super-Brain: Your codebase's working memory. Local graph + vector intelligence f... 10K
A profile-based personal AI agent for terminal, messengers, and private ALP netw... 8K
Fast, multimodal context for agents. 6K
Transcribe your .wav .mp4 .mp3 .flac files to text or record your own audio! 6K
Simultaneous speech-to-text models 6K
Open Vision Agents by Stream. Build voice and vision agents quickly with any mod... 5K
A short description of the package. 5K
Vogent turn keeping plugin for Vision Agents 5K
GailBot API 5K
AUSTR.AI: Local AI assistant with built-in data protection. Chat, anonymizatio... 4K
MCP server for Google search and page fetching using headless Chromium 4K
audia is an agentic Python package that converts PDFs — academic papers, reports... 4K
BanglaSpeech2Text: An open-source offline speech-to-text package for Bangla lang... 3K
3K
Speechlib is a library that unifies speaker diarization, transcription and speak... 3K
Telegram bot for chatting with your folder using LLMs 3K
💬 Fast, cross-platform CLI and GUI for batch transcription, translation, speaker... 3K
An MCP (Model Context Protocol) server that allows you to fetch subtitles for Bi... 3K
Offline voice agent framework for robots. 3K
A stream-translator fork with VAD based audio slicing & GPT / Gemini translation... 3K
Lightweight offline voice assistant for hands-free music control (YouTube Music ... 3K
Vox box 3K
Wyoming protocol server for faster whisper speech to text system 3K
Push-to-talk voice-to-text for Linux. Hold a hotkey, speak, release — text appea... 3K
FastAPI server for OpenAI-compatible audio transcription and translation using f... 3K
Voice-driven AI assistant with OpenAI GPT-4o integration and enhanced dark theme... 3K
Push-to-talk transcription 2K
A modular Python library for voice interactions with AI systems 2K
Use your voice to trigger events and communicate with AI Agents. 2K
Build local, queryable packs from videos, articles, podcasts, and files for MCP ... 2K
The MKV Episode Matcher is a tool for identifying TV series episodes from files 2K
Global hotkeys to record speech and transcribe directly to your cursor 2K
Claude Code for local LLMs. Unified backend, setup, and coding harness for your ... 2K
FasterWhisper for OpenVoiceOS 2K
User-research transcription and quote extraction engine 2K
high quality multi-lingual speech to text 2K
Local meeting recorder with transcription and speaker diarization for Obsidian 2K
An anime girl that lives in your Hyprland and chills. 2K
Hardware-aware, concurrent pipeline for subtitle generation. 2K
Control your entire laptop from Telegram — screenshots, shell, files, processes,... 2K
CLI workflow for AI-assisted livestream clip extraction 2K