Published on 6/25/2026

Google bakes computer control directly into Gemini 3.5 Flash, letting the model see and operate your screen

the-decoder.com · ai-productivity-automation · AI Tools & Product Updates

Insight summary

  • Google integrated 'Computer Use' into Gemini 3.5 Flash to enable the model to see, understand, and interact with computers and devices directly.
  • This capability allows agents to operate across browser, mobile, and desktop environments for tasks like software testing and office automation.
  • Gemini 3.5 Flash scored 78.4 on the OSWorld benchmark, ahead of earlier Gemini versions and GPT-5.4 mini but slightly behind GPT-5.5 and Anthropic's Opus 4.8.
  • To defend against prompt injection attacks, Google relies on adversarial training and on two enterprise safeguards, which either require user confirmation or stop the task automatically.
  • Additional security recommendations include sandboxing, human oversight, and strict access controls, detailed in Google's best practices documentation.
  • The new feature is accessible via the Gemini API and the Gemini Enterprise Agent Platform, with a Browserbase demo and GitHub reference implementation provided.
  • The integration replaces the computer interaction capabilities that were previously offered through a separate Gemini 2.5 model.
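
The two safeguards mentioned above (user confirmation and automatic task stoppage) can be pictured as gates in an agent's action loop. The summary does not describe how Google implements them, so the following is only a minimal sketch under stated assumptions: `Action`, `RISKY_KINDS`, the `flagged` field, and `run_actions` are all hypothetical names invented for illustration and are not part of the Gemini API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical set of action kinds treated as risky in this sketch
# (an assumption for illustration, not taken from Google's documentation).
RISKY_KINDS = {"submit_payment", "delete_file", "send_email"}


@dataclass
class Action:
    kind: str              # e.g. "click", "type", "send_email"
    target: str            # free-form description of what is acted on
    flagged: bool = False  # True if an upstream check suspects prompt injection


def run_actions(
    actions: Iterable[Action],
    confirm: Callable[[Action], bool],
    execute: Callable[[Action], None],
) -> List[str]:
    """Apply two gates to each proposed action and return a log of outcomes."""
    log: List[str] = []
    for action in actions:
        # Gate 1: stop the whole task automatically on a suspected injection.
        if action.flagged:
            log.append(f"stopped:{action.kind}")
            break
        # Gate 2: risky actions run only after explicit user confirmation.
        if action.kind in RISKY_KINDS and not confirm(action):
            log.append(f"declined:{action.kind}")
            continue
        execute(action)
        log.append(f"executed:{action.kind}")
    return log
```

In this sketch a declined action is skipped while the task continues, whereas a flagged action ends the task outright. A real deployment would combine such gates with the sandboxing, human oversight, and access controls listed in the summary.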
