Gemini 2.5 update from Google DeepMind

New Gemini 2.5 Features

Native audio output and live API improvements

Today, the Live API is introducing a preview version of audio-visual input and native audio out dialogue, so you can directly build conversational experiences with a more natural and expressive Gemini.

It also lets you steer the model's tone, accent and style of speaking. For example, you can tell the model to use a dramatic voice when telling a story. It also supports tool use, so it can search on your behalf.

You can try out a set of early features including:

Affective Dialogue, in which the model detects emotion in the user's voice and responds appropriately. Proactive Audio, in which the model ignores background conversations and knows when to respond. Thinking in the Live API, in which the model draws on Gemini's thinking capabilities to support more complex tasks.
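As a rough sketch of how the options described above might be collected into a single session configuration, the snippet below builds a plain Python dictionary and makes no network call. The helper `build_live_config` is hypothetical, and the key names (`response_modalities`, `system_instruction`, `tools`, `enable_affective_dialog`, `proactivity`) are assumptions about the Live API configuration rather than a confirmed schema, so check them against the current Live API reference before relying on them.

```python
import json


def build_live_config(style_hint, use_search=True, affective=True, proactive=True):
    """Assemble a hypothetical Live API session config as a plain dict.

    All key names are assumptions for illustration, not a confirmed schema.
    """
    config = {
        # Ask for spoken replies rather than text.
        "response_modalities": ["AUDIO"],
        # Natural-language steering of tone, accent and speaking style.
        "system_instruction": style_hint,
    }
    if use_search:
        # Tool use: let the model run searches on the user's behalf.
        config["tools"] = [{"google_search": {}}]
    if affective:
        # Early feature: respond to the emotion in the user's voice.
        config["enable_affective_dialog"] = True
    if proactive:
        # Early feature: ignore background talk and decide when to respond.
        config["proactivity"] = {"proactive_audio": True}
    return config


config = build_live_config("Use a dramatic voice when telling stories.")
print(json.dumps(config, indent=2))
```

Keeping each early feature behind its own flag makes it easy to switch one off if a given preview model does not accept it.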

We are also releasing new previews of text-to-speech in 2.5 Pro and 2.5 Flash. These have initial support for multiple speakers, so text can be rendered as speech in two voices via native audio out.

Like native audio dialogue, text-to-speech is expressive and can capture very subtle nuances, such as whispers. It works in over 24 languages and switches seamlessly between them.
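To make the two-voice idea concrete, here is a minimal sketch of what a two-speaker text-to-speech request body could look like. The helper `build_two_speaker_request` and the voice names `VoiceA` and `VoiceB` are hypothetical placeholders. The field names (`responseModalities`, `speechConfig`, `multiSpeakerVoiceConfig` and so on) are assumptions about the request shape, not a confirmed schema. The snippet only builds and prints the body. It sends nothing.

```python
import json


def build_two_speaker_request(lines, voices):
    """Build a hypothetical two-speaker text-to-speech request body.

    lines:  list of (speaker, text) pairs forming the script.
    voices: dict mapping each speaker name to a voice name.
    Field names are assumptions for illustration, not a confirmed schema.
    """
    speakers = list(voices)
    if len(speakers) != 2:
        raise ValueError("this sketch expects exactly two speakers")
    unknown = {name for name, _ in lines} - set(speakers)
    if unknown:
        raise ValueError(f"no voice assigned for: {sorted(unknown)}")

    # The script is sent as ordinary text, one "Speaker: line" per row.
    transcript = "\n".join(f"{name}: {text}" for name, text in lines)
    return {
        "contents": [{"parts": [{"text": transcript}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "multiSpeakerVoiceConfig": {
                    "speakerVoiceConfigs": [
                        {
                            "speaker": name,
                            "voiceConfig": {
                                "prebuiltVoiceConfig": {"voiceName": voices[name]}
                            },
                        }
                        for name in speakers
                    ]
                }
            },
        },
    }


request = build_two_speaker_request(
    [("Narrator", "The house was silent."), ("Ghost", "(whispering) Not quite.")],
    {"Narrator": "VoiceA", "Ghost": "VoiceB"},  # placeholder voice names
)
print(json.dumps(request, indent=2))
```

Stage directions such as "(whispering)" are written directly into the script text, which is one plausible way to ask for the subtle delivery mentioned above.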

versatileai
