Close Menu
Versa AI hub
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
  • Resources

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

What's Hot

How robust AI governance protects enterprise interests

April 10, 2026

The need for AI workflows and monitoring for software developers

April 9, 2026

Workplace learning for AI agents

April 8, 2026
Facebook X (Twitter) Instagram
Versa AI hubVersa AI hub
Saturday, April 11
Facebook X (Twitter) Instagram
Login
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
  • Resources
Versa AI hub
Home»Tools»Start building with Gemini 2.5 Flash
Tools

Start building with Gemini 2.5 Flash

versatileaiBy versatileaiMarch 16, 2026No Comments4 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
#image_title
Share
Facebook Twitter LinkedIn Pinterest Email

Today, we are publishing an early version of Gemini 2.5 Flash in preview via the Gemini API via Google AI Studio and Vertex AI. This new version builds on the popular 2.0 Flash foundation with significant upgrades to inference capabilities while still prioritizing speed and cost. Gemini 2.5 Flash is our first fully hybrid inference model that gives developers the ability to turn thinking on and off. This model also allows developers to set a thinking budget to find the right trade-offs between quality, cost, and delay. Without thinking, developers can maintain the high speed of 2.0 Flash and improve performance.

Our Gemini 2.5 model is a thinking model, allowing you to reason through your thoughts before responding. Instead of producing output immediately, models can perform “thinking” processes to better understand prompts, break down complex tasks, and plan responses. For complex tasks that require multiple reasoning steps (such as solving a math problem or analyzing a research question), the thought process allows the model to arrive at a more accurate and comprehensive answer. In fact, Gemini 2.5 Flash performs second only to 2.5 Pro at hard prompts in LMArena.

2.5 Flash has metrics comparable to other leading models at a fraction of the cost and size.

The most cost-effective thinking model

2.5 Flash continues to lead as the model with the best price/performance ratio.

Chart showing Gemini 2.5 flash price and performance comparison

Gemini 2.5 Flash adds a new model to Google’s Pareto cost and quality frontier. *

Granular controls to manage your thoughts

We know that different use cases require different quality, cost, and latency tradeoffs. To give developers flexibility, we have enabled the setting of a thought budget that provides fine-grained control over the maximum number of tokens that a model can generate during a thought. As the budget increases, the model can make further inferences to improve quality. Importantly, though, the budget sets an upper bound on how much 2.5 Flash can think about, but the model won’t use the entire budget unless prompted.

The plot graph shows that the quality of inference improves as the thinking budget increases

The quality of reasoning improves as the thinking budget increases.

The model is trained to know how long to think for a given prompt, so it automatically decides how long to think based on the perceived complexity of the task.

2.0 If you want better performance than Flash while minimizing cost and latency, set the thought budget to 0. You can also choose to set a specific token budget for the think phase using parameters or sliders in the Google AI Studio and Vertex AI APIs. For 2.5 Flash, the budget range is 0 to 24576 tokens.

The following prompt shows how much inference is used in 2.5 Flash’s default mode.

Prompts that don’t require much reasoning:

Example 1: “Thank you” in Spanish

Example 2: How many provinces are there in Canada?

Prompts requiring moderate reasoning:

Example 1: Roll two dice. What is the probability that they add up to 7?

Example 2: My gym has basketball pickup hours on MWF from 9:00 PM to 3:00 PM and on Tuesdays and Saturdays from 2:00 PM to 8:00 PM. If you work 5 days a week from 9pm to 6pm and want to play basketball for 5 hours during the week, create a schedule that will make everything work.

Prompts that require advanced reasoning:

Example 1: A cantilever beam of length L=3m has a rectangular cross section (width b=0.1m, height h=0.2m) and is made of steel (E=200 GPa). A uniformly distributed load w=5 kN/m is applied along the entire length and a point load P=10 kN is applied at the free end. Calculate the maximum bending stress (σ_max).

Example 2: Create a function Evaluate_cells(cells: Dict(str, str)) -> Dict(str, float) that calculates the values ​​of cells in a spreadsheet.

Each cell contains:

Or a formula like “=A1 + B1 * 2” using cells like +, -, *, /, etc.

Requirements:

Resolve dependencies between cells. Handles operator precedence (*/ before +-). Detects a cycle and raises ValueError(“Cycle detected at “). There is no eval(). Use only built-in libraries.

Start building with Gemini 2.5 Flash today

Gemini 2.5 Flash with thinking capabilities is now available in preview via the Gemini API in Google AI Studio and Vertex AI, and in a dedicated dropdown in the Gemini app. We encourage you to experiment with the thinking_budget parameter and explore how controllable reasoning can help you solve more complex problems.

from google import genai client = genai.Client(api_key=”GEMINI_API_KEY”) response = client.models.generate_content( model=”gemini-2.5-flash-preview-04-17″, content=”You roll two dice. What is the probability that they add up to 7?”, config=genai.types.GenerateContentConfig( Thinking_config=genai.types.ThinkingConfig( Thinking_budget=1024 ) ) print(response.text)

python

Find a detailed API reference and thought guide in our developer documentation, or get started with code examples in the Gemini Cookbook.

We will continue to improve Gemini 2.5 Flash, with more features coming soon before we begin general availability in a full production environment.

*Model prices are taken from Artificial Analysis and company documentation.

author avatar
versatileai
See Full Bio
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleOpenAI Frontier collides enterprise AI agents with SaaS
Next Article First healthcare robotics dataset and fundamental physical AI model for healthcare robotics
versatileai

Related Posts

Tools

How robust AI governance protects enterprise interests

April 10, 2026
Tools

The need for AI workflows and monitoring for software developers

April 9, 2026
Tools

Workplace learning for AI agents

April 8, 2026
Add A Comment

Comments are closed.

Top Posts

Storage bucket now available on Hug Face Hub

March 30, 20264 Views

The best AI chatbots for customer service in September 2025

September 26, 20254 Views

Prioritized LLMS tuning using a direct priority optimization method

August 5, 20254 Views
Stay In Touch
  • YouTube
  • TikTok
  • Twitter
  • Instagram
  • Threads
Latest Reviews

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

Most Popular

Storage bucket now available on Hug Face Hub

March 30, 20264 Views

The best AI chatbots for customer service in September 2025

September 26, 20254 Views

Prioritized LLMS tuning using a direct priority optimization method

August 5, 20254 Views
Don't Miss

How robust AI governance protects enterprise interests

April 10, 2026

The need for AI workflows and monitoring for software developers

April 9, 2026

Workplace learning for AI agents

April 8, 2026
Service Area
X (Twitter) Instagram YouTube TikTok Threads RSS
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms and Conditions
  • Disclaimer
© 2026 Versa AI Hub. All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.

Sign In or Register

Welcome Back!

Login to your account below.

Lost password?