Computer use agent

A computer use agent is an AI agent that can inspect screenshots and control a desktop or browser with mouse and keyboard actions.

Expanded definition

Computer use agents operate graphical interfaces rather than only calling structured APIs. They typically receive a screenshot, choose an action such as a click, drag, typed text, or keyboard shortcut, execute it in a sandboxed environment, and repeat until the task is complete. This makes them useful for legacy apps and visual workflows, but it also creates a distinct risk: webpages, images, and UI text can contain instructions that conflict with the user's goal, a problem commonly called prompt injection. Safer deployments run the agent in a container or virtual machine with minimal privileges, restrict it to an allowlist of domains, limit its access to sensitive data, and require human confirmation for consequential actions.
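
The observe, decide, act cycle and two of the safeguards described above (a domain allowlist and human confirmation) can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's API: `Action`, `run_agent`, `domain_allowed`, the allowlisted host, and the set of consequential action names are all hypothetical, and the screenshot, model, and executor are passed in as plain callables so the loop can run without a real desktop.

```python
from dataclasses import dataclass
from typing import Callable
from urllib.parse import urlparse

# Hypothetical policy values, chosen only for illustration.
ALLOWED_DOMAINS = {"intranet.example.com"}
CONSEQUENTIAL = {"submit_payment", "delete_file", "send_email"}


@dataclass
class Action:
    kind: str         # e.g. "click", "type", "navigate", "done"
    target: str = ""  # URL for "navigate", otherwise a UI element label
    text: str = ""    # text to type, if any


def domain_allowed(url: str) -> bool:
    """Return True only if the URL's host exactly matches the allowlist."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_DOMAINS


def run_agent(
    take_screenshot: Callable[[], bytes],
    decide: Callable[[bytes], Action],
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
    max_steps: int = 20,
) -> list[str]:
    """Observe, decide, act loop with policy checks before each action."""
    log: list[str] = []
    for _ in range(max_steps):  # hard step limit so the loop always ends
        action = decide(take_screenshot())
        if action.kind == "done":
            log.append("done")
            break
        # Refuse navigation outside the allowlist, whatever the page says.
        if action.kind == "navigate" and not domain_allowed(action.target):
            log.append(f"blocked:{action.target}")
            continue
        # Consequential actions need explicit human approval.
        if action.kind in CONSEQUENTIAL and not confirm(action):
            log.append(f"declined:{action.kind}")
            continue
        execute(action)
        log.append(f"executed:{action.kind}")
    return log
```

In a real deployment, `decide` would call a model with the screenshot, `execute` would drive the mouse and keyboard inside the container or virtual machine, and `confirm` would prompt a person. The point of the design is that the policy checks sit outside the model, so text on a webpage cannot talk the agent past them.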
