Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
Audio Captioning datasets for PyTorch.
Using pretrained encoder and language models to generate captions from multimedia inputs.
CoNeTTE is an audio captioning system that generates a short textual description of the sound events in any audio file.