PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
snakers4
silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

872K 9K 768
microsoft
torchscale

Foundation Architecture for (M)LLMs

80K 3K 225
linto-ai
whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

49K 3K 210
sutariyaraj
indic-num2words

Python library for converting numbers to words for all Indian Languages.

23K 36 14
SuperKogito
spafe

spafe: Simplified Python Audio Features Extraction

16K 483 78
r9y9
pysptk

A Python wrapper for Speech Signal Processing Toolkit (SPTK).

13K 451 80
lars76
swift-f0

Fast and accurate fundamental frequency (F0) detector using convolutional neural networks

10K 154 20
resemble-ai
resemble-enhance

AI powered speech denoising and enhancement

6K 2K 273
daanzu
silero-vad-lite

Lightweight wrapper for Silero VAD using an internal ONNX Runtime, with no Python package dependencies

5K 16 1
haoheliu
voicefixer

General Speech Restoration

4K 1K 157
FoxNoseTech
diarize

Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.

4K 62 7
r9y9
nnmnkwii

Library to build speech synthesis systems designed for easy and fast prototyping.

3K 399 71
vocalpy
vak

A neural network framework for researchers studying acoustic communication

2K 91 17
tabahi
bournemouth-forced-aligner

Extract phoneme-level timestamps from speech audio.

2K 130 14
magcil
deepaudio-x

A Python library to train Deep Neural Networks on various audio tasks using Self-Supervised backbones.

2K 27 0
Ilyushin
signal-transformation

Widely used signal transformation using TensorFlow API.

2K 1 0
EveryVoiceTTS
everyvoice

The EveryVoice TTS Toolkit - Text To Speech for your language

2K 43 4
MontrealCorpusTools
polyglotdb

PolyglotDB is a package for phonetic corpus storage and analysis

1K 51 17
takenori-y
lfeats

A unified interface to extract hidden representations from speech foundation models

1K 1 0
tann9949
vistec-ser

Speech Emotion Recognition using PyTorch sponsored by AIS and VISTEC-DEPA AIResearch Institute Thailand.

930 3 2
kahne
fastwer

A PyPI package for fast word/character error rate (WER/CER) calculation

839 70 16
alessandroragano
scoreq

SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)

806 108 8
MarkParker5
stark-engine

S.T.A.R.K - Speech and Text Algorithmic Recognition Kit. Modern framework for creating powerful voice assistants.

736 63 4
Zer0pa
zpe-prosody

ZPE-Prosody V0.0: DETERMINISTIC SPEECH PROSODY CODEC: Intonation | Rhythm | Stress | Emotional Contour | Pitch Transport | Duration Encoding

728 1 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery