80 dependents
| Description | Downloads/month | |
|---|---|---|
| WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarizatio... | 1.1M | |
| senselab is a Python package that simplifies building pipelines for biometric (e... | 13K | |
| Python Speech Language Sample Analysis | 10K | |
| Python Speech Language Sample Analysis | 8K | |
| An insanely fast whisper CLI | 7K | |
| A python package to build AI-powered real-time audio applications | 6K | |
| Speechlib is a library that unifies speaker diarization, transcription and speak... | 3K | |
| 💬 Fast, cross-platform CLI and GUI for batch transcription, translation, speaker... | 3K | |
| Time-Accurate Automatic Speech Recognition using Whisper. | 2K | |
| Speech recognition with accurate word-level timestamps. | 2K | |
| Voice Transformation for Videos. 🎤👄🎬 | 1K | |
| Ingest sources with proper citation — PDF, URL, media, Office, DJVU | 1K | |
| Clips AI is an open-source Python library that automatically converts long video... | 1K | |
| Local speaker diarization using MLX Whisper (macOS) or faster-whisper (Linux/CUD... | 1K | |
| Preprocessing and Extraction of Linguistic Information for Computational Analysi... | 1K | |
| Open dubbing is an AI dubbing system which uses machine learning models to autom... | 1K | |
| Psychological and Social Interactions Feature Extraction | 1K | |
| Forced alignment pipeline designed for efficiency and ease of use. | 1K | |
| Google EMEA gTech Ads Data Science Team's solution to automatically translate an... | 1K | |
| Python library for digital measurement of health | 1K | |
| A simple tool to make the video, audio, subtitle and video-url (especially youtu... | 1K | |
| Flavored fork of m-bain/WhisperX for LeGen better experience | 929 | |
| Fast multi-speaker audio/video transcription — faster-whisper + pyannote.audio | 874 | |
| This package is a Japanese-only fork of ClipsAI. It replaces whisperx with faster-whisper and resolves dependency issues. | 783 | |
| A voice assistant for the command line | 782 | |
| An agentic framework for building AI agents with LLM integration | 711 | |
| User friendly toolkit for generating immersion language learning tools including... | 704 | |
| Add your description here | 697 | |
| Modern speech recognition with word-level timestamps and speaker diarization. Fo... | 687 | |
| Automatic Speech Analysis for Cognitive Assessment | 599 | |
| Speech-to-text with speaker diarization — Whisper + pyannote.audio, optimized fo... | 552 | |
| SONATA (SOund and Narrative Advanced Transcription Assistant): An advanced ASR s... | 542 | |
| Transcription tool for audio files based on Whisper and Pyannote | 541 | |
| Provide Gradio custom components to make the diarization-based audio labeling pr... | 511 | |
| Go from raw audio to a text-audio dataset with OpenAI's Whisper | 484 | |
| Speechless repo for sales call analysis | 444 | |
| Library for ASR transcription and diarization | 405 | |
| A lightweight, offline-first transcription utility for audio and video files wit... | 404 | |
| A Python package to create closed captions with face detection and recognition. | 399 | |
| Faster Whisper transcription with CTranslate2 | 378 | |
| Speechless repo for sales call analysis | 363 | |
| Voice Acoustic Analyzer - Professional audio metrics extraction | 361 | |
| Offline meeting transcription for macOS — auto-detects meetings, transcribes loc... | 349 | |
| The TTSDS benchmark evaluates synthetic speech quality by considering prosody, s... | 346 | |
| A command-line tool for audio transcription with Whisper and Pyannote. | 344 | |
| Integrated tools to transfer the internet audio to text, extract unpopular views... | 336 | |
| A toolkit for audio transcription, speaker diarization, and text processing | 332 | |
| Detect Speaker Change based on Textual Features via LLMs & Rule-Based NLP and Au... | 330 | |
| audiotool is a DeepLearning utility library. | 330 | |
| A compatibility fix for whisperx for use with gogadget | 310 |