I draw badly. If my illustrations were hand-made by yours truly, they would look something like this:
We don’t want that, so let’s look into AI-based image generation 💫.
The tools of today
I tried three of the top tools in the image generation space, looking for one that would work best to make illustrations for the newsletter:
Dall-E by OpenAI (proprietary/closed)
Stable Diffusion by StabilityAI (open)
Midjourney (proprietary/closed)
Dall-E
Dall-E 2 by OpenAI is the first tool I tried. For this newsletter, I want the pictures to:
1. match the desired intention (of course)
2. have a wide 7:4 aspect ratio
3. have a hand-drawn cyberpunk anime style (just because I like it)
With Dall-E, I found that I get (1.), but the tool insists on the 1:1 aspect ratio (no 2.) and I just couldn’t get it to produce images in the style I was going for (no 3.).
Stable Diffusion
Stable Diffusion is the tool of choice in the open-source community. Of the 3 popular image generation models I looked at, it’s the only one that’s open and available to run for free on your own computer.
There’s also a hosted paid version with a web interface that won my heart. The UI has an explicit choice of style (you can pick Anime and it’ll do anime) and a slider that lets you set the aspect ratio.
I liked the web interface, and the Anime-style illustrations the tool was producing. For the last month, Stability.ai was my chief illustrator.
Midjourney
Midjourney is the tool that gets most of the buzz, thanks to its versatility and photo-realistic quality. In 2022, The Economist used Midjourney to create one of its magazine covers. In 2023 it gained notoriety for viral fake photos, including one of Pope Francis wearing a puffer jacket.
I avoided it for the longest time because I just couldn’t get over the interface 😅. Midjourney is an image generation tool stuck in a text chat interface. Do I mean it’s like ChatGPT? Not quite: ChatGPT has its own dedicated interface. Midjourney lives as a chatbot on Discord, a popular text chat platform for gamers. It’s a tool stuck in another tool’s interface.
What does it mean in practice? To use Midjourney you first register for Discord. You can use the Discord app or the web interface at discord.com. Then you sign up for Midjourney itself, and then you start… chatting.
Everything is chat-based. You set parameters such as the aspect ratio (--ar 7:4) and the model to use (--niji 5) using text parameters. The results are returned as an attachment to the text response from the Midjourney bot. If you’d like to remix or upscale one of the 4 images, you click on the suggestion chips (U1, U2,…), where “U” stands for “upscale”.
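Since the parameters are just text, it can help to assemble the message with a tiny script instead of retyping it every time. Here is a minimal Python sketch of that idea. The function name build_prompt and its defaults are my own invention for illustration, not part of Midjourney; the only things taken from the tool are the two parameters mentioned above, written with two plain hyphens (--ar and --niji), and the assumption that they go at the end of the message after the description.

```python
def build_prompt(description: str, aspect_ratio: str = "7:4", niji_version: int = 5) -> str:
    """Return the text to send to the bot: description first, parameters last.

    Hypothetical helper for illustration only; it just formats a string.
    """
    width, height = aspect_ratio.split(":")
    if not (width.isdigit() and height.isdigit()):
        raise ValueError(f"aspect ratio must look like '7:4', got {aspect_ratio!r}")
    # Parameters are plain text appended after the description.
    return f"{description.strip()} --ar {width}:{height} --niji {niji_version}"


if __name__ == "__main__":
    print(build_prompt("hand-drawn cyberpunk anime street at night"))
```

The script only builds the string; you still paste the result into the Discord chat yourself.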
If you want to look up the details of how the images were generated, e.g. the random number used to bootstrap generation, you need to know the secret emoji reaction of adding an ✉️ : this makes the bot reply with the seed 🙃.
The artist trapped in a chat interface
Due to the quirks of being a chatbot, I found Midjourney the weirdest to use. But I ended up liking the results the most ❤️. Introducing Midjourney as the new chief illustrator of this newsletter :):
In one of the future posts we’ll take a look inside those AI generation tools and see how they work. If you’re not yet subscribed, I hope you’ll join us below ⬇️
More on this
✨ Midjourney showcase – best images from the Midjourney community. Be ready to be amazed!
🎞️ The text-to-image revolution, explained – Neat video explanation of image generation
Postcard from Paris
The tourism season is on in Paris! Haven’t seen this many boats on the Seine since before Covid.
Have a great week 💫,
– Przemek