Google is transferring nearer to its aim of a “universal AI assistant” that may perceive context, plan and take motion.
As we speak at Google I/O, the tech large introduced enhancements to its Gemini 2.5 Flash — it’s now higher throughout almost each dimension, together with benchmarks for reasoning, code and lengthy context — and a pair of.5 Professional, together with an experimental enhanced reasoning mode, ‘Deep Think,’ that permits Professional to think about a number of hypotheses earlier than responding.
“This is our ultimate goal for the Gemini app: An AI that’s personal, proactive and powerful,” Demis Hassabis, CEO of Google DeepMind, mentioned in a press pre-brief.
‘Deep Think’ scores impressively on high benchmarks
Google introduced Gemini 2.5 Professional — what it considers its most clever mannequin but, with a one-million-token context window — in March, and launched its “I/O” coding version earlier this month (with Hassabis calling it “the best coding model we’ve ever built!”).
“We’ve been really impressed by what people have created, from turning sketches into interactive apps to simulating entire cities,” mentioned Hassabis.
He famous that, based mostly on Google’s expertise with AlphaGo, AI mannequin responses enhance once they’re given extra time to assume. This led DeepMind scientists to develop Deep Assume, which makes use of Google’s newest cutting-edge analysis in pondering and reasoning, together with parallel strategies.
Deep Assume has proven spectacular scores on the toughest math and coding benchmarks, together with the 2025 USA Mathematical Olympiad (USAMO). It additionally leads on LiveCodeBench, a troublesome benchmark for competition-level coding, and scores 84.0% on MMMU, which assessments multimodal understanding and reasoning.
Hassabis added, “We’re taking a bit of extra time to conduct more frontier safety evaluations and get further input from safety experts.” (Which means: As for now, it’s out there to trusted testers by way of the API for suggestions earlier than the potential is made extensively out there.)
Total, the brand new 2.5 Professional leads widespread coding leaderboard WebDev Area, with an ELO rating — which measures the relative ability degree of gamers in two-player video games like chess — of 1420 (intermediate to proficient). It additionally leads throughout all classes of the LMArena leaderboard, which evaluates AI based mostly on human choice.
Since its launch, “we’ve been really impressed by what [users have] created, from turning sketches into interactive apps to simulating entire cities,” mentioned Hassabis.
Vital updates to Gemini 2.5 Professional, Flash
Additionally at present, Google introduced an enhanced 2.5 Flash, thought-about its workhorse mannequin designed for velocity, effectivity and low value. 2.5 Flash has been improved throughout the board in benchmarks for reasoning, multimodality, code and lengthy context — Hassabis famous that it’s “second only” to 2.5 Professional on the LMArena leaderboard. The mannequin can be extra environment friendly, utilizing 20 to 30% fewer tokens.
Google is making ultimate changes to 2.5 Flash based mostly on developer suggestions; it’s now out there for preview in Google AI Studio, Vertex AI and within the Gemini app. It will likely be usually out there for manufacturing in early June.
Google is bringing extra capabilities to each Gemini 2.5 Professional and a pair of.5 Flash, together with native audio output to create extra pure conversational experiences, text-to-speech to assist a number of audio system, thought summaries and pondering budgets.
With native audio enter (in preview), customers can steer Gemini’s tone, accent and elegance of talking (assume: directing the mannequin to be melodramatic or maudlin when telling a narrative). Like Venture Mariner, the mannequin can be geared up with software use, permitting it to look on customers’ behalf.
Different experimental early voice options embrace affective dialogue, which provides the mannequin the power to detect emotion in consumer voice and reply appropriately; proactive audio that permits it to tune out background conversations; and pondering within the Stay API to assist extra advanced duties.
New multiple-speaker options in each Professional and Flash assist greater than 24 languages, and the fashions can shortly swap from one dialect to a different. “Text-to-speech is expressive and can capture subtle nuances, such as whispers,” Koray Kavukcuoglu, CTO of Google DeepMind, and Tulsee Doshi, senior director for product administration at Google DeepMind, wrote in a weblog posted at present.
Additional, 2.5 Professional and Flash now embrace thought summaries within the Gemini API and Vertex AI. These “take the model’s raw thoughts and organize them into a clear format with headers, key details, and information about model actions, like when they use tools,” Kavukcuoglu and Doshi clarify. The aim is to supply a extra structured, streamlined format for the mannequin’s pondering course of and provides customers interactions with Gemini which are less complicated to grasp and debug.
Like 2.5 Flash, Professional can be now geared up with ‘thinking budgets,’ which provides builders the power to regulate the variety of tokens a mannequin makes use of to assume earlier than it responds, or, if they like, flip its pondering capabilities off altogether. This functionality will likely be usually out there in coming weeks.
Lastly, Google has added native SDK assist for Mannequin Context Protocol (MCP) definitions within the Gemini API in order that fashions can extra simply combine with open-source instruments.
As Hassabis put it: “We’re living through a remarkable moment in history where AI is making possible an amazing new future. It’s been relentless progress.”
Day by day insights on enterprise use instances with VB Day by day
If you wish to impress your boss, VB Day by day has you lined. We provide the inside scoop on what firms are doing with generative AI, from regulatory shifts to sensible deployments, so you possibly can share insights for max ROI.
An error occured.

