
🎸 Riffusion (hobby)

⛔ This project is no longer actively maintained.


Riffusion is a library for real-time music and audio generation with stable diffusion.

Read about it at https://riffusion.com/about and try it at https://riffusion.com/.

This is the core repository for riffusion image and audio processing code.

  • Diffusion pipeline that performs prompt interpolation combined with image conditioning
  • Conversions between spectrogram images and audio clips
  • Command-line interface for common tasks
  • Interactive app using streamlit
  • Flask server to provide model inference via API
  • Various third party integrations

Related repositories:

  • Web app: https://github.com/riffusion/riffusion-app
  • Model checkpoint: https://huggingface.co/riffusion/riffusion-model-v1

Citation

If you build on this work, please cite it as follows:

@article{Forsgren_Martiros_2022,
author = {Forsgren, Seth* and Martiros, Hayk*},
title = {{Riffusion - Stable diffusion for real-time music generation}},
url = {https://riffusion.com/about},
year = {2022}
}

Install

Tested in CI with Python 3.9 and 3.10.

It's highly recommended to set up a virtual Python environment with conda or virtualenv:

conda create --name riffusion python=3.9
conda activate riffusion
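
Or, a minimal equivalent with Python's built-in venv module (standing in for virtualenv; the environment name here is arbitrary):

python3.9 -m venv riffusion-env
source riffusion-env/bin/activate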

Install Python dependencies:

python -m pip install -r requirements.txt

To use audio formats other than WAV, ffmpeg is required.

sudo apt-get install ffmpeg # linux
brew install ffmpeg # mac
conda install -c conda-forge ffmpeg # conda

If torchaudio has no backend, you may need to install libsndfile. See this issue.
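
A quick way to check, assuming a torchaudio version that provides list_audio_backends:

import torchaudio
# An empty list means no audio backend is available and file I/O will fail.
print(torchaudio.list_audio_backends())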

If you have an issue, try upgrading diffusers. Tested with versions 0.9 to 0.11.

Guides:

Backends

CPU

cpu is supported but is quite slow.

CUDA

cuda is the recommended and most performant backend.

To use with CUDA, make sure you have torch and torchaudio installed with CUDA support. See the install guide or stable wheels.

To generate audio in real-time, you need a GPU that can run stable diffusion with approximately 50 steps in under five seconds, such as a 3090 or A10G.

Test availability with:

import torch
torch.cuda.is_available()

MPS

The mps backend on Apple Silicon is supported for inference, but some operations fall back to CPU, particularly for audio processing. You may need to set PYTORCH_ENABLE_MPS_FALLBACK=1.
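
For example, to launch the playground app (described below) with the fallback enabled:

PYTORCH_ENABLE_MPS_FALLBACK=1 python -m riffusion.streamlit.playground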

In addition, this backend is not deterministic.

Test availability with:

import torch
torch.backends.mps.is_available()

Command-line interface

Riffusion comes with a command line interface for performing common tasks.

See available commands:

python -m riffusion.cli -h

Get help for a specific command:

python -m riffusion.cli image-to-audio -h

Execute:

python -m riffusion.cli image-to-audio --image spectrogram_image.png --audio clip.wav
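
The reverse conversion is available as audio-to-image; the flags below are assumed symmetric to the above, so confirm them with -h:

python -m riffusion.cli audio-to-image --audio clip.wav --image spectrogram_image.png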

Riffusion Playground

Riffusion contains a streamlit app for interactive use and exploration.

Run with:

python -m riffusion.streamlit.playground

And access at http://127.0.0.1:8501/


Run the model server

Riffusion can be run as a flask server that provides inference via API. This server enables the web app to run locally.

Run with:

python -m riffusion.server --host 127.0.0.1 --port 3013

You can specify --checkpoint with your own directory or a Hugging Face ID in diffusers format.

Use the --device argument to specify the torch device to use.
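
For example, combining both flags (riffusion/riffusion-model-v1 is shown as an illustrative Hugging Face ID):

python -m riffusion.server --host 127.0.0.1 --port 3013 --checkpoint riffusion/riffusion-model-v1 --device cuda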

The model endpoint is now available at http://127.0.0.1:3013/run_inference via POST request.

Example input (see InferenceInput for the API definition):

{
"Alpha": 0.75,
"num_inference_steps": 50,
"seed_image_id": "og_beat",

"start": {
"prompt": "church bells on sunday",
"seed": 42,
"denoising": 0.75,
"guidance": 7.0
},

"end": {
"prompt": "jazz with piano",
"seed": 123,
"denoising": 0.75,
"guidance": 7.0
}
}

Example output (see InferenceOutput for the API definition):

{
"image": "< base64 encoded JPEG image >",
"audio": "< base64 encoded MP3 clip >"
}
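
A minimal client sketch using only the Python standard library; it assumes the server above is running locally, and tolerates an optional data-URI prefix on the returned fields:

import base64
import json
from urllib.request import Request, urlopen

# The same payload as the example input above.
payload = {
    "alpha": 0.75,
    "num_inference_steps": 50,
    "seed_image_id": "og_beat",
    "start": {"prompt": "church bells on sunday", "seed": 42, "denoising": 0.75, "guidance": 7.0},
    "end": {"prompt": "jazz with piano", "seed": 123, "denoising": 0.75, "guidance": 7.0},
}

request = Request(
    "http://127.0.0.1:3013/run_inference",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urlopen(request) as response:
    output = json.load(response)

# Decode the base64 fields; split(",") tolerates a "data:...;base64," prefix.
with open("clip.mp3", "wb") as f:
    f.write(base64.b64decode(output["audio"].split(",")[-1]))
with open("spectrogram.jpg", "wb") as f:
    f.write(base64.b64decode(output["image"].split(",")[-1]))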

Tests

Tests live in the test/ directory and are implemented with unittest.

To run all tests:

python -m unittest test/*_test.py

To run a single test:

python -m unittest test.audio_to_image_test

To preserve temporary outputs for debugging, set RIFFUSION_TEST_DEBUG:

RIFFUSION_TEST_DEBUG=1 python -m unittest test.audio_to_image_test

To run a single test case within a test:

python -m unittest test.audio_to_image_test -k AudioToImageTest.test_stereo

To run tests using a specific torch device, set RIFFUSION_TEST_DEVICE. Tests should pass with the cpu, cuda, and mps backends.
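
For example, to exercise the CUDA backend:

RIFFUSION_TEST_DEVICE=cuda python -m unittest test/*_test.py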

Development Guide

Install additional packages for development with python -m pip install -r requirements_dev.txt.

  • Linter: ruff
  • Formatter: black
  • Type checker: mypy

These are configured in pyproject.toml.

The results of mypy ., black ., and ruff . must be clean to accept a PR.
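
To run the same checks locally from the repository root (black --check reports formatting issues without rewriting files):

mypy .
black --check .
ruff .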

CI is run through GitHub Actions from .github/workflows/ci.yml.

Contributions are welcome through pull requests.