Google bakes computer control directly into Gemini 3.5 Flash, letting the model see and operate your screen
the-decoder.com · ai-productivity-automation · AI Tools & Product Updates
Insight summary
•Google integrated 'Computer Use' into Gemini 3.5 Flash to enable the model to see, understand, and interact with computers and devices directly.
•This capability allows agents to operate across browser, mobile, and desktop environments for tasks like software testing and office automation.
•Gemini 3.5 Flash scored 78.4 on the OSWorld benchmark, outperforming earlier Gemini versions and GPT-5.4 mini but slightly behind GPT-5.5 and Anthropic's Opus 4.8.
•To defend against prompt injection attacks, Google uses adversarial training plus two enterprise safeguards that either require user confirmation or stop the task automatically.
•Additional security recommendations include sandboxing, human oversight, and strict access controls, detailed in Google's best practices documentation.
•The new feature is accessible via the Gemini API and the Gemini Enterprise Agent Platform, with a Browserbase demo and GitHub reference implementation provided.
•The built-in capability replaces the separate Gemini 2.5 model that previously handled computer interaction.
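
The two safeguards and the access-control advice summarized above can be illustrated with a small sketch. It does not use the Gemini API and is not Google's implementation. Every name in it (`Action`, `Decision`, `review`, `run`, the `flagged` field, the allowlist, and the keyword list) is a hypothetical stand-in chosen for illustration. It assumes an agent proposes one UI action at a time and that some upstream check marks actions suspected of coming from injected instructions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"  # pause and ask the user first
    STOP = "stop"        # abort the whole task


@dataclass(frozen=True)
class Action:
    kind: str              # hypothetical action names, e.g. "click" or "type"
    target: str            # UI element label or URL the action touches
    flagged: bool = False  # assumption: set upstream when injection is suspected


# Assumed policy values; a real deployment would define its own.
ALLOWED_KINDS = {"click", "type", "scroll", "navigate"}
SENSITIVE_WORDS = ("delete", "payment", "password")


def review(action: Action) -> Decision:
    """Classify a proposed action as allowed, needing confirmation, or fatal."""
    if action.flagged or action.kind not in ALLOWED_KINDS:
        return Decision.STOP
    if any(word in action.target.lower() for word in SENSITIVE_WORDS):
        return Decision.CONFIRM
    return Decision.ALLOW


def run(actions: Iterable[Action], confirm: Callable[[Action], bool]) -> List[Action]:
    """Return the actions that passed review, halting at the first refusal."""
    executed: List[Action] = []
    for action in actions:
        decision = review(action)
        if decision is Decision.STOP:
            break
        if decision is Decision.CONFIRM and not confirm(action):
            break
        # A real agent would perform the action inside a sandbox here.
        executed.append(action)
    return executed


if __name__ == "__main__":
    plan = [
        Action("click", "Search box"),
        Action("click", "Delete account"),
        Action("type", "hello"),
    ]
    # The user declines the sensitive step, so the run halts before it.
    for done in run(plan, confirm=lambda a: False):
        print(done.kind, done.target)
```

Keeping `review` separate from `run` means the policy can be tested on its own, and the `confirm` callback is where a human-in-the-loop prompt would plug in.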