ASR Python Packages | PyPI Stats

youtube-transcript-api

This is a Python API that retrieves the transcript or subtitles for a given YouTube video. It also works with automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.

25.6M 7K 764

deepgram-sdk

Official Python SDK for Deepgram.

2.1M 424 127

whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

1.1M 22K 2K

nemo-toolkit

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

798K 17K 3K

vosk

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

458K 15K 2K

whisper-normalizer

A Python package for the Whisper normalizer

454K 76 17

cn2an

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

263K 759 82
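
To illustrate the kind of conversion described above, here is a minimal sketch that turns a Chinese integer numeral into a Python int. It is not cn2an's implementation or API; the function name `cn_to_int` is hypothetical, and the sketch assumes well-formed positive integers only (no fractions, dates, or temperatures).

```python
# Hypothetical helper for illustration; not part of cn2an.
DIGITS = {"零": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}


def cn_to_int(text: str) -> int:
    """Convert a well-formed Chinese integer numeral to an int."""
    total = 0    # value accumulated from completed 万/亿 groups
    section = 0  # value of the current group (below 10,000)
    number = 0   # most recent digit, waiting for its unit
    for ch in text:
        if ch in DIGITS:
            number = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit such as 十 in 十五 means one of that unit.
            section += (number or 1) * SMALL_UNITS[ch]
            number = 0
        elif ch == "万":
            total += (section + number) * 10**4
            section = number = 0
        elif ch == "亿":
            # 亿 scales everything read so far, so 一万亿 is 10**12.
            total = (total + section + number) * 10**8
            section = number = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + number
```

For example, `cn_to_int("一百二十三")` gives 123 and `cn_to_int("一万二千")` gives 12000. A real library also has to handle the reverse direction and malformed input, which this sketch ignores.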

sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, and a websocket server/client, with support for 12 programming languages.

242K 12K 1K

speechmatics-rt

Python SDKs for Speechmatics APIs

144K 18 7

sherpa-onnx-core

144K 12K 1K

speechmatics-voice

Python SDKs for Speechmatics APIs

103K 18 7

sherpa-onnx-bin

56K 12K 1K

whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

49K 3K 210

speechmatics-batch

Python SDKs for Speechmatics APIs

29K 18 7

voice-mode

Natural (2-way) voice conversations with Claude Code

26K 1K 155

onnx-asr

A lightweight Python package for Automatic Speech Recognition using ONNX models

21K 311 30

sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

17K 2K 213

werpy

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.

15K 26 6
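
For readers unfamiliar with the metric, Word Error Rate is the word-level edit distance (substitutions, deletions, and insertions) between a reference and a hypothesis transcript, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition, not werpy's implementation or API; the function name `word_error_rate` is hypothetical, and it assumes whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

With the reference "the cat sat on the mat" and the hypothesis "the cat sat on mat", one word is deleted out of six, so the WER is 1/6. In practice both transcripts are usually normalized (casing, punctuation) before scoring.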