[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.
AudioMaker is a Python package for generating seamless, long-form audio from massive text inputs. Unlike traditional TTS tools, AudioMaker can handle book-length content (even 4+ hours) by splitting text into chunks, synthesizing each chunk, and merging them into a single audio file.