Generating music with Artificial Intelligence
Riffusion takes a clever approach to generative music: a fine-tuned version of Stable Diffusion, trained specifically on spectrograms of music, makes it possible to generate music from just a text prompt.
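To give a feel for the spectrogram trick: the model outputs an image, and a phase-reconstruction step turns that image back into sound. Below is a minimal sketch of that last step using torchaudio's Griffin-Lim transform. Riffusion's real pipeline uses mel-scaled spectrograms with its own amplitude mapping, so treat this as an illustration of the idea, not the project's actual code; the file name and parameters are assumptions.

```python
# Minimal sketch: turn a generated spectrogram image back into audio.
# Assumes a grayscale, 512-pixel-tall image so rows map onto the
# 512 frequency bins implied by n_fft below.
import numpy as np
import torch
import torchaudio
from PIL import Image

n_fft = 1022          # n_fft // 2 + 1 = 512 frequency bins
sample_rate = 44100   # assumed output sample rate

# Load the spectrogram image as a magnitude array in [0, 1]
image = Image.open("spectrogram.png").convert("L")
magnitude = np.asarray(image, dtype=np.float32) / 255.0
# Image rows run top-down, but frequency runs bottom-up
magnitude = np.flipud(magnitude).copy()

# Griffin-Lim iteratively estimates the phase the image doesn't store
griffin_lim = torchaudio.transforms.GriffinLim(n_fft=n_fft, n_iter=32)
waveform = griffin_lim(torch.from_numpy(magnitude))

torchaudio.save("clip.wav", waveform.unsqueeze(0), sample_rate)
```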
Getting started
Getting started with Riffusion is easy. You can either:
- use the hosted version on Replicate (easiest; see the sketch below)
- follow the instructions on GitHub to run it yourself (harder)
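If you go the Replicate route, a minimal sketch with the Replicate Python client might look like the following. The model identifier and the `prompt_a` input name reflect the public model page at the time of writing, but check there for the exact version hash and parameters; you also need a `REPLICATE_API_TOKEN` set in your environment.

```python
# Minimal sketch: run the hosted Riffusion model via Replicate.
# Requires the REPLICATE_API_TOKEN environment variable to be set.
import replicate

output = replicate.run(
    # Append ":<version-hash>" from the model page if the client requires one
    "riffusion/riffusion",
    input={"prompt_a": "a classic guitar song, fingerpicked, no chords"},
)
print(output)  # URLs to the generated audio (and spectrogram image)
```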
I ran Riffusion locally on my M1 Ultra Mac Studio, where generating a 5-second clip took about 10 seconds (2.48 it/s on average).
Creating music
Using Text to Audio, you can start riffing away. Take a look at this guitar example:
prompt: a classic guitar song, fingerpicked, no chords
negative prompt: chords
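If you prefer scripting this over the web UI, the same text-to-spectrogram step can be sketched with Hugging Face diffusers and the riffusion/riffusion-model-v1 checkpoint. This mirrors what the app does under the hood, but isn't identical to it; the output is a spectrogram image, and turning it into audio takes a reconstruction step like the Griffin-Lim sketch above.

```python
# Minimal sketch: generate a spectrogram image from a text prompt
# using the Riffusion checkpoint with Hugging Face diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "riffusion/riffusion-model-v1", torch_dtype=torch.float16
).to("cuda")  # use "mps" on Apple Silicon, dropping the float16 dtype

image = pipe(
    prompt="a classic guitar song, fingerpicked, no chords",
    negative_prompt="chords",
).images[0]
image.save("guitar_spectrogram.png")
```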
Taking it a step further
Just as Stable Diffusion supports img2img transformations, Riffusion supports audio2audio transformations.
In the example below, I created a small tune on my OP-1 synthesizer and accompanied it with the following prompt:
prompt: a jazz song, light drum in background, saxophone
negative prompt: electronic music
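Here is a minimal sketch of that audio2audio idea using diffusers' img2img pipeline. It assumes you have already rendered your recording to a spectrogram image; the file name and strength value are hypothetical, and the Riffusion app exposes a similar knob as its denoising setting.

```python
# Minimal sketch: transform an existing spectrogram (rendered from a
# recording) toward a new prompt, the img2img analogue of audio2audio.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "riffusion/riffusion-model-v1", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("op1_tune.png").convert("RGB")  # hypothetical file
image = pipe(
    prompt="a jazz song, light drum in background, saxophone",
    negative_prompt="electronic music",
    image=init_image,
    strength=0.6,  # hypothetical value; lower keeps more of the original tune
).images[0]
image.save("jazz_spectrogram.png")
```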
In short, Riffusion took 'thinking outside the box' to the next level by using spectrogram images as the basis for generating audio. The resulting clips offer a small glimpse into the future of generative music!