Google presents Gemini 3, the AI model that continues the challenge to ChatGPT: the features

Credit: Google

Google has officially released Gemini 3, an update that marks a new chapter in the race for AI supremacy and is positioned as a direct response to recent launches from OpenAI and Anthropic. It is the most sophisticated model yet from the Mountain View labs, designed not only to chat with the user in text form but to act as “a real thought partner,” as Google puts it in its official press release. The main novelty is the split in the lineup: on one side is Gemini 3 Pro, already available and optimized for multimodal understanding and speed; on the other is Gemini 3 Deep Think, arriving shortly, a variant with deeper reasoning abilities aimed at complex scientific and mathematical problems. Let’s take a closer look at the features of Gemini 3.

The features of Gemini 3

Turning to the technical specifications, Gemini 3 Pro represents a generational leap over version 2.5, which it surpasses in every significant metric. The model reached the top of the LMArena ranking with an Elo score of 1501 (Elo is a comparative rating system, here based on human preferences), but what is most surprising is its performance on rigorous academic tests. On the Humanity’s Last Exam benchmark, designed to test expert-level reasoning, the model scored 37.5% without the aid of external tools, pulling ahead of the previous record holders. Translated into more concrete terms, this means that Google’s new model can handle complex nuances, abandoning the clichéd, flattery-filled answers typical of previous chatbots in favor of responses that are more direct, factual and, where necessary, critical. Its multimodal nature has been further refined: it does not just read text, but can also process video, audio and images with unprecedented precision, as shown by its score of 87.6% on Video-MMMU.
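To give a sense of what an Elo score expresses, here is a minimal sketch of the standard Elo formulas in Python. It is an illustration only: the 1401 rival rating and the K-factor of 32 are hypothetical values chosen for the example, and LMArena's actual rating computation may differ from textbook Elo.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation: probability-like score that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """Standard Elo update after one comparison.

    actual is 1.0 for a win, 0.5 for a tie, 0.0 for a loss.
    k (the K-factor) is an assumed value, not one taken from LMArena.
    """
    return rating + k * (actual - expected)


if __name__ == "__main__":
    gemini_rating = 1501.0  # score cited in the article
    rival_rating = 1401.0   # hypothetical rival, 100 points lower

    p = expected_score(gemini_rating, rival_rating)
    print(f"Expected preference rate for the higher-rated model: {p:.2f}")
```

Under these assumptions, a 100-point gap corresponds to the higher-rated model being preferred in roughly 64% of head-to-head comparisons, which is why the absolute number matters less than the distance from competitors.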

For those looking for even higher performance, there is the Gemini 3 Deep Think mode, which will be available to Ultra plan subscribers after further safety testing. Designed to “think” before responding, this mode achieved notable results on several benchmarks. On the ARC-AGI test, which evaluates the ability to solve problems never seen before, it reaches 45.1%, a value that indicates a capacity for abstraction and generalization that begins to resemble human deductive thinking. Imagine giving the model entire video lessons or complex academic articles: Deep Think will not simply summarize them, but will be able to generate data visualizations, interactive flashcards or personalized study plans, acting as a sort of dedicated university tutor.

Google also introduced the concept of vibe and agentic coding, a methodology for developing software based more on intention and natural-language description than on rigid syntax, entrusting the AI with the task of translating the idea into working code. To support this vision, Google created Google Antigravity, an agentic development platform that the company describes in these terms:

Google Antigravity transforms AI assistance from a tool in a developer’s toolkit to an active partner. While the heart of Google Antigravity is a familiar AI IDE experience, its agents have been elevated to a dedicated interface and given direct access to the editor, terminal, and browser. Now agents can autonomously plan and execute complex, end-to-end software tasks on your behalf, while validating their own code.

These agentic capabilities, meaning the ability of AI to act as an autonomous entity pursuing a goal, extend well beyond programming. Thanks to better long-term planning, measured by the Vending Bench 2 benchmark (a complex resource management simulation), Gemini 3 appears able to handle complex daily tasks. To give a simple example, it will be possible to delegate the management of your Gmail inbox to it, asking it not only to read messages but also to organize and reply to them, or to have it plan complex travel itineraries by cross-referencing data from different sources. The goal is therefore to shift the interaction from entering a prompt to delegating a complex task.

The safety and reliability of Google’s “smartest model”

One aspect that Google has strongly emphasized is the safety and reliability of the model. In a landscape where AI hallucinations are still a problem, Gemini 3 showed progress on the SimpleQA Verified test, achieving a factual accuracy of 72.1%. Furthermore, the model was trained to better resist “prompt injection,” an increasingly insidious type of attack used by cybercriminals to trick AI into performing unintended actions, and also to avoid sycophancy, the tendency of AI to confirm the user’s opinions even when they are incorrect. Since all the tests cited here were run internally by Google, we will need to try the new Gemini 3 in the field to see whether the model really delivers the improvements described, which, at least on paper, look quite promising.