Speaker Diarization Python Packages

funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

357K 16K 2K

whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

49K 3K 210

espnet

End-to-End Speech Processing Toolkit

30K 10K 2K

wespeakerruntime

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

10K 1K 192

funasr-onnx

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

7K 16K 2K

diart

A Python package to build AI-powered real-time audio applications

6K 2K 161

senko

Very fast, accurate speaker diarization

4K 255 28

spectralcluster

Python re-implementation of the (constrained) spectral clustering algorithms used in Google's speaker diarization papers.

4K 550 75

diarize

Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.

4K 62 7

speechlib

Speechlib is a library that unifies speaker diarization, transcription and speaker recognition in a single pipeline to create transcripts for audio conversations with actual speaker names and time tags. This library also contains audio preprocessor functions.

3K 258 27

whisper-run

Faster Whisper with Speaker Diarization

755 9 1

funasr-torch

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

718 16K 2K

diarizationlm

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

704 449 38

pvfalcon

On-device speaker diarization powered by deep learning

666 71 7

transcribe-critic

Multi-source transcript merging inspired by textual criticism — LLM adjudicates multiple Whisper, YouTube captions & external transcripts for higher quality. Includes speaker diarization and summarization.

621 22 2

sonata-asr

SONATA (SOund and Narrative Advanced Transcription Assistant): An advanced ASR system that captures human expressions including emotive sounds and non-verbal cues.

542 5 2

gryannote

Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.

511 70 8

simpleder

A lightweight library to compute Diarization Error Rate (DER).

486 62 9

sidlingvo

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

466 449 38

uisrnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

429 2K 319

funasr-python

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

356 16K 2K

stjlib

STJLib provides data classes and utilities for working with STJ files, which are used to represent transcribed audio and video data in a structured, machine-readable JSON format.

351 0 0

pvfalcondemo

On-device speaker diarization powered by deep learning

327 71 7

precisetranscribe

Utilities for transcribing audio files using the Whisper API.

324 1 0
