PyPI Stats

Search Packages

Find Python packages by name, description, GitHub topic, or filter by metrics
jdepoix
youtube-transcript-api

This is a Python API that retrieves the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.

25.6M 7K 764
deepgram
deepgram-sdk

Official Python SDK for Deepgram.

2.1M 424 127
m-bain
whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

1.1M 22K 2K
NVIDIA
nemo-toolkit

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

798K 17K 3K
alphacep
vosk

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

458K 15K 2K
kurianbenoy
whisper-normalizer

A Python package for the Whisper normalizer

454K 76 17
Ailln
cn2an

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

263K 759 82
k2-fsa
sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, and a websocket server/client, with support for 12 programming languages

242K 12K 1K
speechmatics
speechmatics-rt

Python SDKs for Speechmatics APIs

144K 18 7
k2-fsa
sherpa-onnx-core

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, and a websocket server/client, with support for 12 programming languages

144K 12K 1K
speechmatics
speechmatics-voice

Python SDKs for Speechmatics APIs

103K 18 7
k2-fsa
sherpa-onnx-bin

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, and a websocket server/client, with support for 12 programming languages

56K 12K 1K
linto-ai
whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

49K 3K 210
speechmatics
speechmatics-batch

Python SDKs for Speechmatics APIs

29K 18 7
mbailey
voice-mode

Natural (2-way) voice conversations with Claude Code

26K 1K 155
istupakov
onnx-asr

A lightweight Python package for Automatic Speech Recognition using ONNX models

21K 311 30
k2-fsa
sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

17K 2K 213
analyticsinmotion
werpy

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.

15K 26 6
ai-bot-pro
achatbot

An open-source chatbot architecture for voice/vision (and multimodal) assistants that can run locally (CPU/GPU bound) or remotely (I/O bound).

14K 89 18
fgnt
meeteval

MeetEval - A meeting transcription evaluation toolkit

11K 154 18
coqui-ai
stt

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

9K 3K 300
sccn
eegprep

EEGPrep is an automated preprocessing tool for human EEG data built on a benchmarked EEGLAB pipeline

8K 21 4
Picovoice
pvcheetah

On-device streaming speech-to-text engine powered by deep learning

6K 662 77
pkulijing
whisper-input

Renamed to daobidao; this package is only a forwarding shim. Please run `pip install daobidao`.

6K 0 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery