CoNeTTE is an audio captioning system that generates a short textual description of the sound events in any audio file.