PyPI Stats

Speech To Text Python Packages

Python packages with the GitHub topic speech-to-text. Sorted by relevance, with stars and monthly downloads.
Uberi
speechrecognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

7.8M 9K 2K
SYSTRAN
faster-whisper

Faster Whisper transcription with CTranslate2

7.6M 23K 2K
deepgram
deepgram-sdk

Official Python SDK for Deepgram.

2.2M 424 127
m-bain
whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

1.1M 22K 2K
alphacep
vosk

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

476K 15K 2K
microsoft
foundry-local-sdk

Foundry Local Manager Python SDK: Control-plane SDK for Foundry Local.

329K 2K 307
revdotcom
rev-ai

Rev AI Python SDK

262K 36 13
k2-fsa
sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

245K 12K 1K
KoljaB
realtimestt

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

220K 10K 837
snakers4
silero

Silero Models: pre-trained text-to-speech models made embarrassingly simple

178K 6K 363
k2-fsa
sherpa-onnx-core

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

148K 12K 1K
Blaizzy
mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

87K 7K 578
k2-fsa
sherpa-onnx-bin

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

57K 12K 1K
linto-ai
whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

54K 3K 210
speechmatics
speechmatics-python

Python library and CLI for Speechmatics

51K 75 23
gradio-app
fastrtc

The Python library for real-time communication

51K 5K 430
Softcatala
whisper-ctranslate2

Whisper command line client compatible with original OpenAI client based on CTranslate2.

38K 1K 124
mozilla
deepspeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

28K 27K 4K
istupakov
onnx-asr

A lightweight Python package for Automatic Speech Recognition using ONNX models

22K 311 30
Xewdy444
playwright-recaptcha

A Python library for solving reCAPTCHA v2 and v3 with Playwright

20K 529 67
analyticsinmotion
werpy

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.

15K 26 6
Capsize-Games
airunner

Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows

14K 1K 97
ARahim3
mlx-tune

Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning — natively on MLX. Unsloth-compatible API.

13K 1K 79
mozilla
deepspeech-gpu

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

11K 27K 4K
Data from PyPI, GitHub, ClickHouse, and BigQuery