Speaker Recognition Python Packages

nemo-toolkit

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

823K 17K 3K

wespeakerruntime

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

10K 1K 192

diarize

Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.

4K 62 7

speechlib

Speechlib is a library that unifies speaker diarization, transcription, and speaker recognition in a single pipeline to create transcripts of audio conversations with actual speaker names and time tags. It also includes audio preprocessing functions.

3K 258 27

pveagle

On-device speaker recognition engine powered by deep learning

1K 42 6

nemo-toolkit-asr

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

1K 17K 3K

pveagledemo

On-device speaker recognition engine powered by deep learning

954 42 6

nemo-asr

Collection of Neural Modules for Speech Recognition

693 17K 3K

pvfalcon

On-device speaker diarization powered by deep learning

686 71 7

diarizationlm

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

673 450 38

mvector

Voiceprint recognition toolkit built on PyTorch

651 1K 167

ppvector

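This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports multiple data preprocessing methods such as MelSpectrogram, Spectrogram, MFCC, and Fbank.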

618 312 50

sidlingvo

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

471 450 38

uisrnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

460 2K 319

pvfalcondemo

On-device speaker diarization powered by deep learning

370 71 7

nemo-nlp

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

344 17K 3K

hyperion-ml

Python toolkit for speech processing

296 72 21

audioperm

A Python library for generating different permutations of audible segments from audio files.

278 13 2

wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models with PyTorch backend.

261 92 14
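
The diarize entry above quotes two figures, a diarization error rate (DER) and a speed relative to real time. As a rough sketch of how such numbers are usually computed: DER is commonly defined as the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time, and "Nx faster than real-time" is the audio duration divided by the processing time. The function names and all input durations below are hypothetical, chosen only for illustration; they are not taken from any package listed here or from its benchmarks.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction, given durations in seconds.

    missed:       reference speech the system labeled as non-speech
    false_alarm:  non-speech the system labeled as speech
    confusion:    speech attributed to the wrong speaker
    total_speech: total reference speech time
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def real_time_speedup(audio_seconds, processing_seconds):
    """Return how many times faster than real time the processing ran."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Hypothetical example values (not measured results).
der = diarization_error_rate(
    missed=40.0, false_alarm=30.0, confusion=38.0, total_speech=1000.0
)
speedup = real_time_speedup(audio_seconds=600.0, processing_seconds=75.0)

print(f"DER: {der:.1%}")
print(f"Speed: {speedup:.1f}x real time")
```

Because false alarms count as errors but not toward the denominator, DER can exceed 100% on difficult audio. Published figures also depend on scoring conventions such as a forgiveness collar around segment boundaries and whether overlapped speech is scored.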