1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more. This is because it was trained by 1.5 Pro through a process called “distillation,” which transfers the most essential knowledge and skills from a larger model to a smaller, more efficient one.
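To make the idea of distillation concrete, here is a minimal sketch of a generic soft-label distillation loss, in which a small "student" model is penalized for diverging from a larger "teacher" model's output distribution. This is a textbook formulation and not a description of how 1.5 Flash was actually trained; the function names, the example logits, and the temperature value are all assumptions chosen for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Return log-probabilities for temperature-scaled logits (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_sum for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper: a higher temperature flattens both distributions,
    so the student also learns from the teacher's lower-ranked choices.
    """
    teacher_logp = log_softmax(teacher_logits, temperature)
    student_logp = log_softmax(student_logits, temperature)
    return sum(
        math.exp(t) * (t - s) for t, s in zip(teacher_logp, student_logp)
    )


# Illustrative logits over three candidate tokens (made-up numbers).
teacher = [2.0, 1.0, 0.1]
student_far = [0.1, 1.0, 2.0]   # disagrees with the teacher's ranking
student_near = [1.9, 1.0, 0.2]  # close to the teacher

loss_far = distillation_loss(teacher, student_far)
loss_near = distillation_loss(teacher, student_near)
loss_same = distillation_loss(teacher, teacher)
```

In this sketch the loss is zero when the student reproduces the teacher exactly and grows as their distributions diverge, which is the signal a training loop would minimize.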
To learn more about 1.5 Flash, see the latest Gemini 1.5 technical report on the Gemini Technology page and check 1.5 Flash availability and pricing.
Significant improvements in 1.5 Pro
Over the past few months, we’ve made significant improvements to 1.5 Pro, our best model for general performance across a wide range of tasks.
In addition to expanding the context window to 2 million tokens, we have also enhanced code generation, logical reasoning and planning, multi-turn conversation, and speech and image understanding through data and algorithmic advances. We see significant improvements in public and internal benchmarks for each of these tasks.
1.5 Pro can now follow increasingly complex and nuanced instructions, including those that specify product-level behavior such as role, format, and style. We’ve improved control over the model’s responses for specific use cases, such as crafting the persona and response style of a chat agent or automating workflows through multiple function calls. We’ve also enabled users to steer model behavior by setting system instructions.
We added audio understanding to the Gemini API and Google AI Studio, so 1.5 Pro can now reason across both the images and the audio of videos uploaded to Google AI Studio. We’re also integrating 1.5 Pro into Google products, including Gemini Advanced and Workspace apps.
To learn more about 1.5 Pro, check out the latest Gemini 1.5 technical report and the Gemini technology page.
Gemini Nano understands multimodal input
Gemini Nano is expanding beyond text-only input to include images as well. Starting with Pixel, applications that use Gemini Nano with multimodality will be able to understand the world the way people do: not just through text, but also through sight, sound, and spoken language.
Read more about Gemini 1.0 Nano on Android.