PyPI Stats

Speaker Diarization Python Packages

Python packages tagged with the GitHub topic speaker-diarization, sorted by relevance and listed with stars and monthly downloads.
alibaba-damo-academy
funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

363K 16K 2K
linto-ai
whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

54K 3K 210
espnet
espnet

End-to-End Speech Processing Toolkit

29K 10K 2K
wenet-e2e
wespeakerruntime

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

10K 1K 192
alibaba-damo-academy
funasr-onnx

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

7K 16K 2K
juanmc2005
diart

A Python package to build AI-powered real-time audio applications

6K 2K 161
narcotic-sh
senko

Very fast, accurate speaker diarization

4K 255 28
FoxNoseTech
diarize

Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.

4K 62 7
wq2012
spectralcluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

4K 550 75
NavodPeiris
speechlib

Speechlib is a library that unifies speaker diarization, transcription and speaker recognition in a single pipeline to create transcripts for audio conversations with actual speaker names and time tags. This library also contains audio preprocessor functions.

3K 258 27
gorkemkaramolla
whisper-run

Faster Whisper with Speaker Diarization

829 9 1
thc1006
taiwan-asr-toolkit

Production-grade Traditional Chinese / Taiwan Mandarin speech-to-text. Qwen3-ASR + MediaTek Breeze-ASR-25, hot-word injection, LLM polish, speaker diarization. RTF up to 1554x on RTX 5090, 56 TDD tests.

759 1 0
alibaba-damo-academy
funasr-torch

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

742 16K 2K
Picovoice
pvfalcon

On-device speaker diarization powered by deep learning

686 71 7
google
diarizationlm

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

673 450 38
ringger
transcribe-critic

Multi-source transcript merging inspired by textual criticism — LLM adjudicates multiple Whisper, YouTube captions & external transcripts for higher quality. Includes speaker diarization and summarization.

584 22 2
hwk06023
sonata-asr

SONATA (SOund and Narrative Advanced Transcription Assistant): An advanced ASR system that captures human expressions including emotive sounds and non-verbal cues.

542 5 2
clement-pages
gryannote

Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.

514 70 8
wq2012
simpleder

A lightweight library to compute Diarization Error Rate (DER).

486 62 9
google
sidlingvo

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

471 450 38
google
uisrnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

460 2K 319
Picovoice
pvfalcondemo

On-device speaker diarization powered by deep learning

370 71 7
alibaba-damo-academy
funasr-python

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

355 16K 2K
yaniv-golan
stjlib

STJLib provides data classes and utilities for working with STJ files, which are used to represent transcribed audio and video data in a structured, machine-readable JSON format.

344 0 0
Data from PyPI, GitHub, ClickHouse, and BigQuery