Remixing album covers

🎨 Messing with album art using ControlNet and Stable Diffusion

Aug 06, 2023

Do you recognize something about this album cover?

It follows the visual structure of the famous Pink Floyd album “The Dark Side of the Moon”. As for the content itself, an Egyptian pyramid in a rising sun, I just generated it on my laptop using Stable Diffusion.

Conceptually, the process has three stages: read the initial picture (1. below), find the edges in the picture (2.), and finally generate a new picture (3.) that keeps the structure of the original.
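Stage 2 is the easiest one to illustrate in isolation. Below is a minimal sketch in plain Python, assuming a tiny grayscale image stored as a list of rows. It uses a Sobel gradient with a fixed threshold, which is a much simpler stand-in for a real Canny edge detector, and it is not the code I ran for the covers. The function name `edge_map`, the threshold value and the toy image are made up for illustration.

```python
def edge_map(gray, threshold=200):
    """Return a binary edge map (1 = edge) for a grayscale image.

    `gray` is a list of rows, each a list of brightness values (0-255).
    Simplification: Sobel gradient magnitude plus a threshold, not full Canny.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]

    # Sobel kernels for horizontal (kx) and vertical (ky) brightness changes.
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

    # Border pixels are skipped and stay 0, since the 3x3 window does not fit there.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = gray[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * pixel
                    gy += ky[dy][dx] * pixel
            # Mark the pixel as an edge when the gradient is strong enough.
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges


# Toy 6x6 "picture": dark left half, bright right half.
toy = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edges = edge_map(toy)
for row in edges:
    print("".join("#" if value else "." for value in row))
```

On the toy picture, the only pixels marked are the two columns next to the dark/bright boundary. An edge map like this is the kind of structure that stage 3 is asked to preserve.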

Follow the structure

This scratches an itch I have had ever since I started learning about AI-based image processing: how to generate a picture that follows a specific composition? It doesn’t seem possible with any of the major out-of-the-box hosted systems out there.

Yes, you can upload a picture of the Pink Floyd album and a picture of a pyramid and some camels to Midjourney’s “blend” mode, but the results are approximate at best.

Midjourney blend mode input:

Midjourney response:

What I want is something precise.

ControlNet

In February 2023, two researchers at Stanford published ControlNet, a method that guides Stable Diffusion to follow specific visual properties of the desired output, for example an edge structure we want to see in the resulting picture.

The fruits of their work are very impressive, not least because they trained their model on regular consumer hardware. The resulting model works together with Stable Diffusion, steering the generative model towards the desired visual structure as it computes the image.

Based on my weekend experimentation, it works like a charm.

ControlNet Canny model running in the A1111 Stable Diffusion environment

Results

Pink Floyd “The Dark Side of the Moon” + “Egyptian pyramid in a rising sun”:

The Beatles “Abbey Road” + “Four hikers crossing a road in a forest”:

Frank Turner “Be More Kind” + “Holding hands in front of a dark starry sky, milky way visible”:

Any suggestions for other covers to cover :) ? I intend to publish a larger collection on the pnote.eu website, suggestions welcome ⬇️

In other news

The image generation community is excited about the latest huge Stable Diffusion model called SDXL. I tried to get it to work on my laptop. Yes, it’s stunning. Yes, it takes 15 minutes to generate a single picture.
New York launched a "Cleaner Cops!" campaign in which members of the NYPD bathed and scrubbed publicly → AI-generated humour
The newsletter keeps growing slowly but steadily. Welcome to the new subscribers 💫!

Postcard from Slovenia

One more picture from the Slovenia trip last weekend: Ljubljana at night, featuring a piece of street art by the French artist Invader.

Have a great week 💫,
– Przemek
