Google announces Gemini 2.5 Computer Use AI model that can control web browsers like humans do

08th October, 2025 15:18 IST

Google just rolled out Gemini 2.5 Computer Use, an AI model that can actually click buttons, fill out forms, and scroll through websites just like a person would. Instead of relying on structured APIs to interact with software, this model uses visual understanding to navigate interfaces designed for humans.

How the Gemini 2.5 Computer Use model works
Built on Gemini 2.5 Pro's visual understanding capabilities, the model operates in a continuous loop. It receives screenshots of the current environment, analyses the user's request along with action history, and generates responses as function calls representing UI actions. The system supports 13 actions including opening browsers, typing text, dragging and dropping elements, and navigating URLs. After each action executes, the model receives a new screenshot to restart the loop until the task completes.
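The loop described above can be sketched in a few lines of Python. This is only an illustration of the screenshot, propose, execute cycle, not Google's API: `Action`, `FakeBrowser`, `scripted_model`, `run_agent_loop` and the action names `navigate` and `type_text` are all hypothetical stand-ins, and the "model" here simply replays a fixed script instead of calling a real service.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical shape)."""
    name: str
    args: dict = field(default_factory=dict)


class FakeBrowser:
    """Stand-in environment so the loop can run without a real browser."""

    def __init__(self) -> None:
        self.url = "about:blank"
        self.typed: List[str] = []

    def screenshot(self) -> bytes:
        # A real client would return image bytes; here we encode the state as text.
        return f"{self.url}|{''.join(self.typed)}".encode()

    def execute(self, action: Action) -> None:
        if action.name == "navigate":
            self.url = action.args["url"]
        elif action.name == "type_text":
            self.typed.append(action.args["text"])


def scripted_model(goal: str, screenshot: bytes, history: List[Action]) -> Optional[Action]:
    """Placeholder for the model call: replays a fixed script, then signals completion."""
    script = [
        Action("navigate", {"url": "https://example.com"}),
        Action("type_text", {"text": "hello"}),
    ]
    return script[len(history)] if len(history) < len(script) else None


def run_agent_loop(
    goal: str,
    take_screenshot: Callable[[], bytes],
    propose_action: Callable[[str, bytes, List[Action]], Optional[Action]],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Screenshot -> propose action -> execute, repeated until the model returns None."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = propose_action(goal, screenshot, history)
        if action is None:  # model considers the task complete
            break
        execute(action)
        history.append(action)
    return history


browser = FakeBrowser()
history = run_agent_loop("say hello", browser.screenshot, scripted_model, browser.execute)
```

The `max_steps` cap is a design choice of this sketch: it keeps a confused model from looping forever.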

Google demonstrated the model through use cases ranging from managing pet spa appointments across multiple websites to organising digital sticky notes. The model shows particular strength in web browsers and Android mobile interfaces, though it's not yet optimised for desktop operating system control.

Google’s Gemini 2.5 Computer Use beats Claude, ChatGPT on benchmarks
Google claims Gemini 2.5 Computer Use outperforms rivals like Claude and ChatGPT on several web and mobile control benchmarks, while also delivering lower latency. Early testers are already putting it to work. One AI assistant company said it's often 50% faster than competing solutions, while another found it boosted performance by up to 18% on complex data parsing tasks. Google's own payments team uses it to fix broken UI tests, successfully recovering over 60% of failed test runs.

Safety guardrails are in place to mitigate AI risks
AI agents that control computers carry unique risks, including potential misuse and unexpected behaviour, so Google built safety features directly into the model. Developers can also set up controls that stop the AI from automatically completing high-risk actions such as bypassing CAPTCHAs or compromising system security.
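One way a developer might wire up such a control on their own side is a simple confirmation gate in front of the action executor. The sketch below is an assumption about how that could look, not a description of Google's safety service: the `HIGH_RISK` labels, `guarded_execute` and the `confirm` callback are hypothetical names chosen for illustration.

```python
from typing import Callable, List

# Hypothetical labels for actions that should never run without a human sign-off.
HIGH_RISK = {"solve_captcha", "change_security_settings"}


def guarded_execute(
    action_name: str,
    execute: Callable[[str], None],
    confirm: Callable[[str], bool],
) -> bool:
    """Run the action unless it is high-risk and the user declines; return True if it ran."""
    if action_name in HIGH_RISK and not confirm(action_name):
        return False  # blocked: the user did not approve
    execute(action_name)
    return True


executed: List[str] = []
deny_all = lambda name: False  # stand-in for a real confirmation prompt

ran_click = guarded_execute("click", executed.append, deny_all)
ran_captcha = guarded_execute("solve_captcha", executed.append, deny_all)
```

Here the ordinary click goes through, while the CAPTCHA action is held back because the confirmation callback refuses it.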

The model is available now in preview through Google AI Studio and Vertex AI, and there is a demo on Browserbase where you can watch it tackle tasks like playing 2048 or browsing Hacker News.
