OpenAI Operator

OpenAI launches Operator, the agent that can use the browser in your place: what it does and how to use it

OpenAI Operator. Credit: OpenAI.

OpenAI, the parent company of ChatGPT and Sora, has in the past few hours announced Operator, its first AI agent capable of performing actions directly on the web on the user's behalf, such as booking flights and trips or shopping online. This experimental system represents a step forward in the transformation of artificial intelligence from a “simple” assistant into a truly independent agent that responds to text prompts. Unlike classic chatbots, which are limited to generating content in response to the user's input, Operator can actively interact with the browser: it navigates between pages, fills in forms, clicks buttons and scrolls through content, just as a human user would (or almost). The aim is to lighten the load of repetitive tasks and improve the efficiency of online work. At the moment Operator is available only in the United States, for those with an active ChatGPT Pro plan.

At the time of writing, Operator is available in preview to Pro users in the United States (for the record, this plan costs 200 dollars a month). In the future it should also be extended to Plus, Team and Enterprise users. During the presentation, OpenAI CEO Sam Altman promised:

Operator will soon be in other countries. Europe, unfortunately, will take some time.

How OpenAI Operator works: the features

Operator is based on an advanced model called CUA (Computer-Using Agent), which combines the vision capabilities of GPT-4o with a sophisticated system that OpenAI describes as “reinforcement learning”. This allows the agent to recognize the graphical interfaces of websites and interact independently with buttons, menus and other clickable elements without relying on additional components. On this point, OpenAI explains:

Operator can “see” (through screenshots) and “interact” (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations.

As for the CUA model, it has been trained to ask for the user's confirmation before completing tasks that can have a concrete, real-world effect, such as sending an e-mail or submitting an order. This means the user can check the model's work before it actually takes effect.
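The confirmation step described here can be illustrated with a minimal sketch. This is not OpenAI's implementation: the `Action` type, the `SENSITIVE_KINDS` set and the `run_agent` function are hypothetical names invented for the example, and the model that would normally propose actions from screenshots is replaced by a fixed list.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical: action kinds this sketch treats as having a real-world effect.
SENSITIVE_KINDS = {"send_email", "submit_order"}


@dataclass
class Action:
    kind: str    # e.g. "click", "type", "scroll", "submit_order"
    target: str  # free-form description of the element acted on


def run_agent(proposed: Iterable[Action],
              confirm: Callable[[Action], bool]) -> List[str]:
    """Run proposed actions in order, asking the user before sensitive ones."""
    log: List[str] = []
    for action in proposed:
        # Sensitive actions are only executed if the user approves them.
        if action.kind in SENSITIVE_KINDS and not confirm(action):
            log.append(f"skipped {action.kind} on {action.target}")
            continue
        log.append(f"executed {action.kind} on {action.target}")
    return log


if __name__ == "__main__":
    plan = [Action("click", "search button"),
            Action("submit_order", "checkout form")]
    # The user declines every confirmation request in this run.
    for line in run_agent(plan, confirm=lambda action: False):
        print(line)
```

In this sketch an ordinary click goes through unprompted, while the order submission is held back until the `confirm` callback returns `True`.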

But OpenAI warns that the CUA model is not perfect and that, therefore, it “does not expect it to work reliably in all scenarios, for now”. Operator is, in fact, still at an embryonic stage, which is why it can make mistakes, even glaring ones. When this happens, according to what OpenAI said in the presentation, it can draw on its reasoning capabilities to correct itself.

What the Operator agent can do

Operator's usefulness is potentially limitless, at least looking to the future. Among the many things it could do are automating tasks related to organizing travel, booking restaurants, shopping online, filling in forms, and the like.

How to use Operator

Using Operator is simple and intuitive. You just describe the task you want Operator to carry out, and it will take care of completing it. That's it. The user can take control of the remote browser Operator is using at any time. Not only that: Operator itself will ask the user to step in to enter login details and payment data and to solve CAPTCHAs.

Explaining how Operator works, OpenAI said:

Users can personalize their workflows in Operator by adding custom instructions, either for all sites or for specific ones, such as setting preferences for airlines on Booking.com. Operator lets users save prompts for quick access on the home page, ideal for repeated tasks like restocking groceries on Instacart. Similar to using multiple tabs in a browser, users can have Operator run multiple tasks at the same time by creating new conversations, like ordering a personalized enamel mug on Etsy while booking a campsite on Hipcamp.
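To make the idea of instructions “for all sites or for specific ones” concrete, here is a small sketch of how global and per-site instructions might be combined for a given page. It is purely illustrative: the `instructions_for` function, the rule texts and the data layout are assumptions made for the example, not a description of how Operator stores or applies them.

```python
from typing import Dict, List
from urllib.parse import urlparse


def instructions_for(url: str,
                     global_rules: List[str],
                     site_rules: Dict[str, List[str]]) -> List[str]:
    """Return the global rules followed by any rules for the URL's site."""
    host = urlparse(url).hostname or ""
    # Treat "www.example.com" and "example.com" as the same site.
    if host.startswith("www."):
        host = host[4:]
    return list(global_rules) + list(site_rules.get(host, []))


if __name__ == "__main__":
    # Hypothetical rule texts, for illustration only.
    global_rules = ["Always show the total price before checkout"]
    site_rules = {"booking.com": ["Prefer direct flights"]}
    print(instructions_for("https://www.booking.com/flights",
                           global_rules, site_rules))
    print(instructions_for("https://www.etsy.com/", global_rules, site_rules))
```

Putting the site-specific rules after the global ones is a design choice of this sketch, so that the more specific preference is the last one read.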

If you want to see Operator at work, you can take a look at this short video.

https://www.youtube.com/watch?v=gyqs-wukzsm

How safe the AI agent is, and what its risks and limits are

Now let's turn to the “security” chapter. To ensure adequate safety, OpenAI has implemented three levels of protection. First, the agent is programmed to request confirmation before performing sensitive actions, such as sending payments or entering credentials. Second, users can opt out of having their data used for AI training, delete their history and log out of sites with a single click. Third, OpenAI has developed an advanced monitoring system to identify potential cyber threats, protecting Operator from manipulation attempts by malicious sites.

From the preceding paragraphs it is clear that Operator is still a “prototype” product, so it is not surprising that it has some limitations. OpenAI itself admits that Operator “is not able to reliably manage many complex or specialized tasks, such as creating detailed presentations, managing complex calendar systems or interacting with highly customized or non-standard web interfaces”.

For security reasons, moreover, OpenAI has deliberately limited Operator's range of action, always requiring supervision by the human user for all activities judged sensitive, such as adding payment information, of which Operator does not capture screenshots. Even when using e-mail, Operator requires the user's active participation, so that the user can spot and correct any errors. At this stage, again for security reasons, Operator cannot send e-mails or delete calendar events.

And since it can also get completely stuck, when this happens it “passes the ball” to the user, handing back control of the operations to be completed.