The Language of Audio 📻

AudioLDM 2: an AI generation model for all audio types

Hi folks!👋🏻 This is The Prompt! No fishy AI stories here. Just the catch of the day.

Let's reel in the news

FEATURED

AudioLDM 2: The Language of Audio

You’d think it would be easy to train a single AI model that generates music, sound effects, and speech.

These audio types do have a lot in common. But each one comes with its own biases, so building a single general-purpose model has been tough.

At least so far.

AudioLDM 2 is the latest work in this field. It generalizes across audio types and can create all of them with just one AI model.

How does it work?

The goal is to have a universal representation of audio.

So, the researchers built a framework that generalizes well across different types of audio. It relies on a shared representation they call the “Language of Audio”, or LOA.

It's like the model has a universal key to all sounds, and it uses that key to learn about them on its own, without needing anyone to tell it what each sound means.

Resources

The authors shared a list of 350 AI-generated audio files, plus you can try to create your own on Hugging Face.

You can also find some examples on their project page.

🚨 What else is going on

📕 Resources

🧰 Tools of the trade

✍🏼 Prompt of the Day

TOOL

Midjourney

PROMPT

breathtaking landscape shot, [LOCATION] --ar 3:2 --style raw

RESULT
