Transcription Python Packages

speechmatics-python

Python library and CLI for Speechmatics

50K 75 23

speach

🐍🍑 Python 3 library for managing, annotating, and converting natural language corpora using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.)

29K 21 6

spotify-translator

Generate lyric translations and transcriptions from Spotify URLs using OpenAI's Whisper model.

18K 0 0

pvcheetah

On-device streaming speech-to-text engine powered by deep learning

6K 662 77

diart

A Python package to build AI-powered real-time audio applications

6K 2K 161

pvleopard

On-device speech-to-text engine powered by deep learning

5K 481 29

subaligner

Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/

3K 506 24

transkun

A simple yet effective Audio-to-MIDI Automatic Piano Transcription system

3K 335 33

speechlib

Speechlib is a library that unifies speaker diarization, transcription and speaker recognition in a single pipeline to create transcripts for audio conversations with actual speaker names and time tags. This library also contains audio preprocessor functions.

3K 258 27

deepctl

Official Deepgram CLI — speech-to-text, text-to-speech, and audio intelligence from your terminal

3K 3 2

lyrics-transcriber

Automatically create synchronised lyrics files in ASS and LRC with word-level timestamps, using Whisper and lyrics from online sources, with anchor sequences and LLMs to auto-correct transcription

3K 91 17

meeting-hive

Local-first, AI-queryable meeting archive. Ingests from meeting tools, applies personal vocabulary corrections, and stores in portable markdown.

2K 0 0

p3x-meet-assistant

Real-time AI speech-to-text for meetings with GPT-4o Transcribe and GPU speaker diarization

2K 0 0

voiceprocessingtoolkit

The VoiceProcessingToolkit is an all-encompassing suite designed for sophisticated voice detection, wake word recognition, text-to-speech synthesis, and advanced audio processing. It offers intuitive interfaces to streamline the integration of voice processing capabilities into your applications.

2K 4 1