Dependents of vector-quantize-pytorch

56 dependents

Package	Description	Downloads/month
stable-audio-tools	Generative models for conditional audio generation	118K
endoreg-db	endoreg-db	25K
neucodec	A package for NeuCodec, based on xcodec2.	24K
zetascale	Build high-performance AI models with modular building blocks	19K
audiolm-pytorch	Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation...	18K
meshgpt-pytorch	Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch	16K
dalle2-pytorch	Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural netw...	15K
chattts	A generative speech model for daily dialogue.	7K
metacontroller-pytorch	Implementation of the MetaController proposed in "Emergent temporal abstractions...	7K
stylegan2-pytorch	Simplest working implementation of Stylegan2, state of the art generative advers...	6K
simpletuner	A general fine-tuning kit geared toward image/video/audio diffusion models.	6K
magvit2-pytorch	Implementation of MagViT2 Tokenizer in Pytorch	5K
x-transformers-rl	Implementation of a transformer for reinforcement learning using `x-transformers...	4K
improving-transformers-world-model	Implementation of the new SOTA for model based RL, from the paper "Improving Tra...	3K
phenaki-pytorch	Implementation of Phenaki Video, which uses Mask GIT to produce text guided vide...	3K
naturalspeech2-pytorch	Natural Speech 2 - Pytorch	3K
diffsptk	A differentiable version of SPTK	2K
muse-maskgit-pytorch	Implementation of Muse: Text-to-Image Generation via Masked Generative Transform...	2K
xcodec2	A library for XCodec2 model.	1K
genie2-pytorch	Genie2	1K
parti-pytorch	Implementation of Parti, Google's pure attention-based text-to-image neural netw...	1K
colortransferlib	This library provides color and style transfer algorithms which were published in...	658
boson-multimodal	Boson Multimodal - A multimodal AI framework	657
ichigo	Ichigo is an open, ongoing research experiment to extend a text-based LLM to hav...	475
audiolm-superfeel	Superfeel adaptation of implementation of SoundStorm, Efficient Parallel Audio G...	472
villa-x	ViLLa-X	422
titok-pytorch	Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens f...	403
chattts-fork	ChatTTS is a generative speech model for daily dialogue.	402
physioex	PhysioEx, a PyTorch Lightning based library for Interpretable physiological sign...	396
harmonai-tools	Generative models for conditional audio generation	371
ichigo-whisper	Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for t...	370
kosmosx	Transformers at zeta scales	369
audiotoken	Audio tokenization, in the fastest way possible!	367
amplify-pytorch	Amplify	360
glaucus	Glaucus is a PyTorch complex-valued ML autoencoder & RF estimation python module...	340
fish-speech-lib	Fish Speech pipeline as library so you don't need to webui.	334
zuna	Foundation model for EEG reconstruction and interpolation	324
thinksound	[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for g...	310
voco-fishspeech	Fish Speech 1.5 TTS plugin for VOCO audio inference runtime	266
sylber	Sylber: Syllabic Embedding Representation of Speech from Raw Audio	247
best-rq-pytorch	Implementation of BEST-RQ - a model for self-supervised learning of speech signa...	233
ichigo-asr	Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for t...	208
ace-step15-fork	ACE-Step 1.5	190
prescient-igloo	Tokenizing Loops of Antibodies (arXiv:2509.08707)	186
pat-vla	Progressive Action Tokenizer framework with FSQ quantization	168
maskbit-pytorch	MaskBit	137
streaming-dvae	Streaming DVAE	136
tts-webui-songbloom	SongBloom package for tts-webui	122
voicestudio	VoiceStudio: A unified toolkit for text-style prompted speech synthesis, voice a...	119
soundstream	Implementation of SoundStream, an end-to-end neural audio codec	107