OpenAI, the developer of ChatGPT, has announced Operator, a generative AI service that acts as an agent and performs tasks on your behalf. Operator uses its own browser to view web pages and interact with them by typing, clicking, and scrolling, with no input required from the user.
The rollout will be gradual, with ChatGPT Pro subscribers in the United States getting it first.
Operator can handle a variety of repetitive browser tasks; OpenAI says it can fill out forms, order groceries, and even create memes. OpenAI also pitches it to businesses: because Operator uses the same interfaces and tools that humans interact with, it opens up new opportunities for engagement.
Operator Research Preview. An agent that can use its own browser to perform tasks. pic.twitter.com/wkBBDIlVqj
— OpenAI (@OpenAI) January 23, 2025
Operator is powered by a new model called CUA (Computer-Using Agent), which combines GPT-4o's vision capabilities with advanced reasoning through reinforcement learning. CUA is trained to work with GUIs, the graphical user interfaces of buttons, menus, and text fields that appear on the screen.
If Operator runs into trouble or needs help, it simply hands control back to you. You will also have to enter sensitive data manually, such as passwords and other verification details.
Operator can integrate with services like DoorDash, Etsy, Booking.com, Uber, and Instacart, and conduct research through media partners like The Associated Press and Reuters.