In December, we first introduced native image output in Gemini 2.0 Flash to trusted testers. Today, we are making it available for developer experimentation across all regions currently supported by Google AI Studio. You can test this new capability using an experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp) in Google AI Studio and via the Gemini API.
Gemini 2.0 Flash combines multimodal input, enhanced reasoning, and natural language understanding to create images.
Here are examples of where 2.0 Flash's multimodal outputs shine:
1. Text and images together
Use Gemini 2.0 Flash to tell a story and it will illustrate it with pictures, keeping the characters and settings consistent throughout. Give it feedback and the model will retell the story or change the style of its drawings.
Video: Story and illustration generation in Google AI Studio
2. Conversational image editing
Gemini 2.0 Flash helps you edit images through multiple turns of natural language dialogue, making it perfect for iterating toward a perfect image or exploring different ideas together.
Video: Multi-turn image editing in Google AI Studio, maintaining context throughout the conversation
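Under the hood, multi-turn editing is conversation bookkeeping: every request carries the prior turns, so the model keeps context. Here is a minimal offline sketch of that loop, with a `generate` callable standing in for a real Gemini API call (this helper is illustrative, not part of the SDK):

```python
def edit_loop(generate, instructions):
    """Apply edit instructions one turn at a time, threading the full
    history through each call. This mirrors the context-keeping that a
    chat session provides server-side.

    `generate` is a stand-in for a real model call: it takes the history
    (a list of (role, text) tuples) and returns the model's reply.
    """
    history = []
    for instruction in instructions:
        history.append(("user", instruction))
        history.append(("model", generate(history)))
    return history
```

Each follow-up instruction ("make the sky darker", "now add a cat") is interpreted against everything that came before, rather than as a fresh prompt.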
3. Understanding the world
Unlike many other image generation models, Gemini 2.0 Flash leverages world knowledge and enhanced reasoning to create the right image. This makes it well suited to producing realistic, detailed imagery, such as illustrating a recipe. Like all language models, it strives for accuracy, but its knowledge is broad and general, not absolute or complete.
Video: Interleaved text and image output for a recipe in Google AI Studio
4. Text rendering
Most image generation models struggle to accurately render long sequences of text, often producing poorly formatted, illegible, or misspelled characters. Internal benchmarks show that 2.0 Flash renders text more reliably than leading competing models, making it ideal for creating ads, social posts, or invitations.
Video: Image output with long-form text rendering in Google AI Studio
Start creating images with Gemini today
Get started with Gemini 2.0 Flash via the Gemini API. Read more about image generation in the documentation.
from google import genai
from google.genai import types

client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=(
        "Generate a story about a cute baby turtle in a 3D digital art style. "
        "For each scene, generate an image."
    ),
    config=types.GenerateContentConfig(
        response_modalities=["Text", "Image"]
    ),
)
Whether you are building AI agents, developing apps with beautiful visuals like illustrated interactive stories, or brainstorming visual ideas in conversation, Gemini 2.0 Flash lets you add text and image generation with a single model. We are eager to see what developers create with native image output, and your feedback will help us finalize a production-ready version.