Environmental sound generation

AudioGen: Textually-Guided Audio Generation

With AudioGen, we show that AI models can be trained to perform well at text-to-audio generation. AudioGen is an auto-regressive transformer language model designed to produce high-quality audio from textual descriptions. Given a textual description of an acoustic scene, the model generates realistic environmental sounds that reflect the description, including intricate recording conditions and complex contextual details.

Text-to-sound generation

AudioGen is trained for the task of text-to-sound generation. Given a text prompt, it generates 5 seconds of audio that follows the description.
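As a rough illustration of this prompt-to-audio workflow, the sketch below generates one 5-second clip per prompt. It assumes the open-source `audiocraft` Python package and its `facebook/audiogen-medium` checkpoint, neither of which is specified on this page. The `slugify` and `generate_clips` helpers and the `audiogen_out` directory are hypothetical names introduced only for this example.

```python
from pathlib import Path

# A few of the example prompts from this page.
PROMPTS = [
    "Whistling with wind blowing",
    "Sirens and a humming engine approach and pass",
    "Typing on a typewriter",
]


def slugify(prompt: str) -> str:
    """Hypothetical helper: turn a prompt into a safe file stem."""
    return "".join(c.lower() if c.isalnum() else "_" for c in prompt).strip("_")


def generate_clips(prompts, duration_s=5.0, out_dir="audiogen_out"):
    """Generate one clip per prompt and write each to a .wav file.

    Assumption: the audiocraft package is installed and the checkpoint
    name below is available; neither is confirmed by this page.
    """
    # Imported lazily so the helper above works without audiocraft installed.
    from audiocraft.models import AudioGen
    from audiocraft.data.audio import audio_write

    model = AudioGen.get_pretrained("facebook/audiogen-medium")
    model.set_generation_params(duration=duration_s)  # clip length in seconds

    # generate() takes a list of text descriptions and returns a batch of
    # waveforms, one per description.
    wavs = model.generate(list(prompts))

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for prompt, wav in zip(prompts, wavs):
        stem = out / slugify(prompt)
        # audio_write appends the ".wav" extension to the stem it is given.
        audio_write(str(stem), wav.cpu(), model.sample_rate, strategy="loudness")
        paths.append(stem.with_suffix(".wav"))
    return paths
```

Calling `generate_clips(PROMPTS)` would download the checkpoint on first use and return the paths of the written files. A GPU is likely needed for reasonable generation speed.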

Whistling with wind blowing

Sirens and a humming engine approach and pass

A duck quacking as birds chirp and a pigeon cooing

Railroad crossing signal followed by a train passing and blowing horn

Typing on a typewriter
