Meta Unveils AI Tool for Music and Audio Generation from Text Descriptions
Meta has introduced an AI tool named AudioCraft, designed to generate high-quality, realistic audio and music from text descriptions.
The introduction of AudioCraft is part of Meta’s ongoing efforts to develop generative AI tools, including some for Instagram. The company is also working on a tool designed to detect AI-generated content.
AudioCraft is meant to simplify adding music to content. It aims to eliminate the time-consuming task of searching for suitable songs, offering a more streamlined approach for users, particularly small business owners who want to add soundtracks to their video ads on Instagram.
While AudioCraft is not yet available on Meta’s platforms, the company has made the tool’s code open-source. This strategic move allows researchers and practitioners to train their own models using custom datasets, thereby contributing to the advancement of AI-generated audio and music.
AudioCraft comprises three models: MusicGen, AudioGen, and an improved version of EnCodec. MusicGen specializes in creating music and was trained on roughly 400,000 music recordings, accompanied by text descriptions and metadata. AudioGen generates lifelike environmental sounds from written descriptions of acoustic scenes, while the improved EnCodec decoder allows higher-quality music generation with fewer artifacts.
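For readers who want to try the models, the sketch below shows roughly how a text prompt could be turned into an audio file with the open-source audiocraft Python package. It is a minimal illustration, not Meta's reference code: the helper names pick_checkpoint and generate_clips are hypothetical, and the checkpoint names are assumptions that should be checked against the project's README. Running generate_clips also assumes the audiocraft package and its PyTorch dependencies are installed, and that the model weights can be downloaded.

```python
# Checkpoint names are assumptions based on the publicly listed models;
# verify them against the AudioCraft README before relying on them.
CHECKPOINTS = {
    "music": "facebook/musicgen-small",   # MusicGen: text to music
    "sound": "facebook/audiogen-medium",  # AudioGen: text to environmental sound
}


def pick_checkpoint(kind: str) -> str:
    """Map a task kind ("music" or "sound") to a checkpoint name (hypothetical helper)."""
    try:
        return CHECKPOINTS[kind]
    except KeyError:
        raise ValueError(f"kind must be one of {sorted(CHECKPOINTS)}, got {kind!r}") from None


def generate_clips(prompts, kind="music", duration=8.0, out_prefix="clip"):
    """Generate one audio file per text prompt and return the file names (hypothetical helper)."""
    checkpoint = pick_checkpoint(kind)  # validate before loading anything heavy

    # Imported lazily so the module loads even without audiocraft installed.
    from audiocraft.data.audio import audio_write
    from audiocraft.models import AudioGen, MusicGen

    model_cls = MusicGen if kind == "music" else AudioGen
    model = model_cls.get_pretrained(checkpoint)
    model.set_generation_params(duration=duration)  # clip length in seconds

    wavs = model.generate(list(prompts))  # one waveform per prompt

    paths = []
    for i, wav in enumerate(wavs):
        stem = f"{out_prefix}_{i}"
        # Writes "<stem>.wav" with loudness normalization.
        audio_write(stem, wav.cpu(), model.sample_rate, strategy="loudness")
        paths.append(stem + ".wav")
    return paths
```

A call such as generate_clips(["80s disco with a driving bassline"], kind="music") would then write a short clip to the working directory, while kind="sound" routes the same prompt style to AudioGen.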
Despite the excitement surrounding AudioCraft’s launch, Meta acknowledges the importance of responsible innovation. The company recognizes that its training datasets lack diversity, particularly in terms of music styles and language. By sharing the code for AudioCraft, Meta hopes to encourage other researchers to work on reducing bias and limiting potential misuse in generative models.
Meta has already shared hundreds of samples generated by AudioCraft, showcasing a wide range of outputs from 80s disco and jazz instrumentals to people speaking against a backdrop of enthusiastic cheering. The company says it looks forward to seeing what people create with the tool.