We designed Gemini 2.5 as a family of hybrid reasoning models that deliver outstanding performance while sitting on the Pareto frontier of cost and speed. Today, we’re taking the next step by making the 2.5 Pro and Flash models generally available as stable versions. And we’re now offering a preview of 2.5 Flash-Lite, our fastest and most cost-efficient 2.5 model to date.
2.5 Flash and 2.5 Pro now generally available
Thanks to your feedback, today we’re releasing stable versions of 2.5 Flash and 2.5 Pro, so you can build production applications with confidence. Developers like Spline and Rooms, and organizations like Snap and SmartBear, have already been using the latest versions in production over the past few weeks.
Introducing Gemini 2.5 Flash-Lite
We’re also introducing a preview of the new Gemini 2.5 Flash-Lite, our fastest and most cost-efficient 2.5 model to date. You can start building with the preview version today, and we welcome your feedback.
2.5 Flash-Lite has higher quality than 2.0 Flash-Lite across coding, math, science, reasoning, and multimodal benchmarks. It excels at high-volume, latency-sensitive tasks like translation and classification, with lower latency than 2.0 Flash-Lite and 2.0 Flash on a broad sample of prompts. 2.5 Flash-Lite comes with the same capabilities that make Gemini 2.5 useful, including the ability to turn thinking on at different budgets, connections to tools like Google Search and code execution, multimodal input, and a 1 million-token context length.
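As a rough sketch of how these capabilities surface in practice, the snippet below builds a `generateContent` request for the Gemini REST API with an explicit thinking budget and the Google Search tool enabled. The endpoint path and field names (`thinkingConfig`, `thinkingBudget`, `googleSearch`) reflect our understanding of the public API; treat the exact shapes as assumptions to verify against the current API reference, and note that `build_request` is an illustrative helper, not part of any SDK.

```python
import json

# Base URL for the Gemini REST API (assumed from public documentation).
API_BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_request(model: str, prompt: str, thinking_budget: int = 0,
                  use_search: bool = False) -> tuple[str, str]:
    """Return (url, json_body) for a hypothetical generateContent call."""
    body = {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        # A budget of 0 turns thinking off for latency-sensitive tasks like
        # translation or classification; a larger budget lets the model
        # reason longer on harder problems.
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        },
    }
    if use_search:
        # Ground the response with Google Search results.
        body["tools"] = [{"googleSearch": {}}]
    url = f"{API_BASE}/models/{model}:generateContent"
    return url, json.dumps(body)


# Build (but do not send) a low-latency classification request.
url, payload = build_request(
    "gemini-2.5-flash-lite",
    "Classify the sentiment of: 'great service!'",
    thinking_budget=0,
)
```

Sending the payload is then an ordinary authenticated POST; the point of the sketch is that thinking budgets and tools are just fields on the request, so you can tune cost and latency per call.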
For more information on the 2.5 family of models, see the latest Gemini technical report.