AI art generation has evolved at a wild pace, and Google has thrown another big contender into the mix with Gemini Flash 2.0. You can try the new image creation tools in Google’s AI Studio.
Gemini Flash, as the name suggests, is extremely fast, noticeably faster than DALL-E 3 and other image creators. That speed might be expected to mean lower-quality images, but that is not the case here, especially given all the changes and upgrades to the model’s image production abilities. Still, if you want really good results, you need to know how to talk to the AI. After a lot of trial and error, I’ve put together five tips for getting the absolute best art out of Gemini Flash 2.0. Some of these may seem similar to advice for other AI art creators.
Tell a story
The most interesting new feature of Gemini Flash image creation is that it is not only suitable for one-off illustrations, but also helps you create a visual story by generating a series of related images with consistent styles, settings and moods.
To get started, just ask it to tell a story. The results include images that accompany the text.
For my project, I asked the AI to “create a story of a heroic baby dragon who protected the fairy queen from evil wizards in a 3D cartoon animation style. For each scene, generate an image.” The result is shown above. And if something is off, you can rewrite any part of the story and the model will redo the image accordingly.
Be very specific
If you tell Gemini to make a “dog in the park,” you might get a blurry golden retriever sitting vaguely in some greenery. But when you say, “A fluffy golden retriever sitting on a wooden bench in Central Park in autumn, with red and orange leaves scattered on the ground,” you get exactly what you’re picturing.
AI models thrive on detail. The more you provide, the better the images will be. So for the image above, I asked not just for a futuristic city, but for “a retro-futuristic cityscape at sunset, neon signs glowing in pink and blue, flying cars in the sky, people walking in retro-future style costumes.” Seven seconds later, I had the result.
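One way to make this habit repeatable is to assemble a prompt from labeled parts, so that no detail gets forgotten. The following is a minimal Python sketch under my own assumptions: the `build_prompt` helper and its parameter names are hypothetical, it calls no Gemini API, and the string it returns is simply what you would paste into AI Studio.

```python
def build_prompt(subject, setting="", time_of_day="", details=()):
    """Join the labeled parts of a scene into one descriptive prompt string.

    Hypothetical helper for illustration only; it calls no Gemini API.
    """
    parts = [subject]
    if setting:
        parts.append(f"in {setting}")
    if time_of_day:
        parts.append(f"at {time_of_day}")
    prompt = " ".join(parts)
    # Extra details are appended as a comma-separated list, skipping blanks.
    extras = [d.strip() for d in details if d.strip()]
    if extras:
        prompt += ", with " + ", ".join(extras)
    return prompt + "."


# The vague request from the text, next to a detailed version of it.
vague = build_prompt("a dog", setting="the park")
specific = build_prompt(
    "a fluffy golden retriever sitting on a wooden bench",
    setting="Central Park",
    details=["red and orange autumn leaves scattered on the ground"],
)
print(vague)
print(specific)
```

Splitting the prompt into subject, setting, time of day, and details is just one possible breakdown. The point is that an empty slot is easy to spot before you send the request.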
Have a conversation
One of my favorite things about the new Gemini Flash is that you can have a conversation with it without losing much speed. This means that you don’t have to get everything right in one go. After generating the image, you can literally chat with the AI to edit it. Want to change the colors? Want to add text? Want moodier lighting? Just ask.
For the image set above, I started by asking for “a cozy reading corner with a fireplace, a bookcase filled with novels, and a large comfortable armchair.” I then asked it to “make it nighttime with soft, warm lighting,” then to “add a sleeping cat to the armchair,” and refined it by asking the AI to “give the room a vintage Victorian aesthetic.” The final result on the left looks pretty much like what I imagined, and Gemini feels like an art assistant.
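If you want to keep track of a refinement session like this one, the sequence of requests can be recorded as an ordered list of turns. This is only a sketch: the `ImageSession` class is a hypothetical bookkeeping aid of my own, it sends nothing to Gemini, and the edit wording is paraphrased from the session described above.

```python
from dataclasses import dataclass, field


@dataclass
class ImageSession:
    """Tracks an initial image request plus follow-up edit requests.

    Hypothetical bookkeeping class for illustration; it sends nothing to Gemini.
    """
    initial_prompt: str
    edits: list = field(default_factory=list)

    def refine(self, instruction):
        # Record one small change per turn, the way you would in a chat.
        self.edits.append(instruction)
        return self

    def transcript(self):
        # Numbered list of every message in the order it was sent.
        turns = [self.initial_prompt, *self.edits]
        return "\n".join(f"{i}. {t}" for i, t in enumerate(turns, start=1))


session = ImageSession(
    "A cozy reading corner with a fireplace, a bookcase filled with novels, "
    "and a large comfortable armchair."
)
session.refine("Make it nighttime with soft, warm lighting.")
session.refine("Add a sleeping cat to the armchair.")
session.refine("Give the room a vintage Victorian aesthetic.")
print(session.transcript())
```

Keeping each edit as its own short instruction mirrors the one-change-per-message approach, and makes it easy to see which step to revisit if the image drifts away from what you wanted.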
Use real-world knowledge
Google boasts that Gemini is full of real-world knowledge. This means that you can get historical accuracy, realistic cultural details, and true-to-life imagery if you ask for it. Of course, you need to be specific. For example, if you ask for a “Viking warrior,” you might get something that looks like a Game of Thrones character. But you get something more accurate when you say, “A historically accurate 9th-century Viking warrior wearing detailed chain mail armor and a traditional Scandinavian helmet, carrying a round wooden shield.”
As a test, I asked the AI to “make an ancient Mayan city at sunrise.” It’s not perfect, but it looks more authentic than what the previous version produced.
Write text quickly
Most AI image models have long struggled with rendering text, turning words into unreadable scrawl. Even today’s better models take a while to do it and can need several attempts to get it right. Gemini Flash, however, is surprisingly good at integrating text into images quickly and legibly. Again, being very specific helps.
That’s how I generated the image above, by asking the AI to “make a vintage-style travel poster reading ‘Visit London’ in bold retro typography, featuring a stylized illustration of the city.”