PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
alibaba-damo-academy
funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

357K 16K 2K
linto-ai
whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

49K 3K 210
espnet
espnet

End-to-End Speech Processing Toolkit

30K 10K 2K
wenet-e2e
wespeakerruntime

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

10K 1K 192
alibaba-damo-academy
funasr-onnx

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

7K 16K 2K
juanmc2005
diart

A Python package to build AI-powered real-time audio applications

6K 2K 161
narcotic-sh
senko

Very fast, accurate speaker diarization

4K 255 28
wq2012
spectralcluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

4K 550 75
FoxNoseTech
diarize

Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.

4K 62 7
NavodPeiris
speechlib

Speechlib is a library that unifies speaker diarization, transcription and speaker recognition in a single pipeline to create transcripts for audio conversations with actual speaker names and time tags. This library also contains audio preprocessor functions.

3K 258 27
gorkemkaramolla
whisper-run

Faster Whisper with Speaker Diarization

755 9 1
alibaba-damo-academy
funasr-torch

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

718 16K 2K
google
diarizationlm

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

704 449 38
Picovoice
pvfalcon

On-device speaker diarization powered by deep learning

666 71 7
ringger
transcribe-critic

Multi-source transcript merging inspired by textual criticism — LLM adjudicates multiple Whisper, YouTube captions & external transcripts for higher quality. Includes speaker diarization and summarization.

621 22 2
hwk06023
sonata-asr

SONATA (SOund and Narrative Advanced Transcription Assistant): An advanced ASR system that captures human expressions including emotive sounds and non-verbal cues.

542 5 2
clement-pages
gryannote

Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.

511 70 8
wq2012
simpleder

A lightweight library to compute Diarization Error Rate (DER).

486 62 9
google
sidlingvo

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

466 449 38
google
uisrnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

429 2K 319
alibaba-damo-academy
funasr-python

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

356 16K 2K
yaniv-golan
stjlib

STJLib provides data classes and utilities for working with STJ files, which are used to represent transcribed audio and video data in a structured, machine-readable JSON format.

351 0 0
Picovoice
pvfalcondemo

On-device speaker diarization powered by deep learning

327 71 7
aeronjl
precisetranscribe

Utilities for transcribing audio files using the Whisper API.

324 1 0
Data from PyPI, GitHub, ClickHouse, and BigQuery