PyPI Stats

Speech Python Packages

Python packages with the GitHub topic speech. Sorted by relevance, with stars and monthly downloads.
huggingface
datasets

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

120.3M 21K 3K
pytorch
torchaudio

Data manipulation and transformation for audio signal processing, powered by PyTorch

12.5M 3K 770
modelscope
modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

4.1M 9K 934
pndurette
gtts

Python library and CLI tool to interface with Google Translate's text-to-speech API

1.5M 3K 383
m-bain
whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

1.1M 22K 2K
snakers4
silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

830K 9K 768
coqui-ai
tts

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

204K 45K 6K
eginhard
monotonic-alignment-search

Monotonically align text and speech

195K 4 1
snakers4
silero

Silero Models: pre-trained text-to-speech models made embarrassingly simple

178K 6K 363
OpenBMB
voxcpm

VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

128K 17K 2K
HumeAI
hume

Python client for Hume AI

119K 174 44
Rikorose
deepfilternet

Noise suppression using deep filtering

62K 4K 443
linto-ai
whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

54K 3K 210
interactiveaudiolab
penn

Pitch Estimating Neural Networks (PENN)

26K 273 26
sensein
senselab

senselab is a Python package that simplifies building pipelines for biometric (e.g. speech, voice, video, etc) analysis.

14K 38 9
ai-bot-pro
achatbot

An open source chat bot architecture for voice/vision (and multimodal) assistants that can run locally (CPU/GPU bound) or remotely (I/O bound).

14K 89 18
r9y9
pysptk

A Python wrapper for Speech Signal Processing Toolkit (SPTK).

13K 451 80
Sinapsis-AI
sinapsis

Modular and Universal AI

10K 40 11
xinjli
allosaurus

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

9K 727 101
felixbur
nkululeko

Machine learning audio prediction experiments based on templates

9K 43 12
gregorias
pycodec2

Python's interface to codec 2

9K 24 8
huggingface
nlp

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

8K 21K 3K
modelscope
clearvoice

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

8K 4K 337
ina-foss
inaspeechsegmenter

CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.

8K 886 149
    • Data from PyPI, GitHub, ClickHouse, and BigQuery