
ChatGPT can make mistakes: artificial intelligence errors

The daily use of ChatGPT as a tool is increasingly widespread. Every day new LLMs (Large Language Models) appear, built on artificial intelligence technologies that improve day by day. What is important to keep in mind, however, is that artificial intelligence remains substantially different from human intelligence: it does not really think, but it imitates our reasoning very well, and better all the time.

Artificial intelligence is not human intelligence, but it imitates it very well

Artificial intelligence is, by definition, a technology that tries to give machines characteristics typically considered human, such as the ability to reason. This often leads us to think that this kind of software is actually able to "think for itself"; some even believe it can have emotions! In truth, that is not the case.

When we talk about reasoning by artificial intelligence, we must keep in mind that it is still software that, thanks to certain technologies such as machine learning, reproduces mechanisms that mimic our reasoning. It may seem like a subtle difference, but it is a substantial one. Consider LLMs, i.e. Large Language Models such as ChatGPT. These are literally "large language models", that is, software trained to manipulate the texts they are given during the training phase. This means that, in order to answer us, they interpret the words that make up our questions and associate them with the texts they have been "taught", so as to respond in the most coherent way possible. And how do they do it? In a nutshell, they look for the most likely answer among everything they have learned during training. This allows LLMs to answer our questions very well, but it can also produce the so-called "hallucinations". What are they?
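To make the idea concrete, here is a minimal, purely illustrative Python sketch of "picking the most likely answer": a language model assigns a probability to each candidate continuation and chooses the most probable one. The context and the probabilities below are invented for the example and do not come from any real model.

```python
# A toy, made-up example of "choosing the most likely continuation".
# A real LLM computes these probabilities over tens of thousands of tokens
# at every single step; here they are hard-coded for illustration only.

context = "The capital of France is"

# Hypothetical probabilities a model might assign to the next token.
next_token_probs = {
    " Paris": 0.92,
    " Lyon": 0.04,
    " a": 0.03,
    " beautiful": 0.01,
}

# Greedy decoding: always pick the single most probable continuation.
most_likely = max(next_token_probs, key=next_token_probs.get)
print(context + most_likely)  # -> The capital of France is Paris
```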

The example of the "r"s in "ramarro": why ChatGPT gets their position wrong

To understand this, let's consider ChatGPT 4o, i.e. the version without reasoning, and ask it: "How many r's are there in ramarro and in what position are they in the word?". The answer we get will probably be incorrect (unless we use the o1 version, which uses the CoT methodology), because a correct answer would require a reasoning capacity that, so far, it does not have. We ourselves asked the software, which got the number of "r"s right but got their positions wrong:

ChatGPT error

The same thing happens if we ask the same question to DeepSeek, again without activating the reasoning option.

However, to obtain a satisfactory answer from both ChatGPT and DeepSeek, we just need to activate the reasoning, that is, the "improved" versions of the two LLMs: ChatGPT o1 and DeepSeek R1.
Thanks to a methodology called Chain of Thought (CoT), these reasoning models manage to complete the task correctly without problems. But how do they do it?

Why AI makes errors and how the Chain of Thought solves the problem

First of all, let's try to understand why, without using reasoning, the two models get it wrong. We must keep in mind that LLMs, precisely because they do not really think, are not actually aware of all the letters they are reading: they process text in chunks called tokens. We can imagine tokens as a sort of "syllable", enlarged or shrunk depending on how precise the programmers want to be and how much computing power they have available. For this reason, since they read tokens and not individual letters, the two LLMs do not know which letters each token contains, and are therefore unable to count the number of r's in "ramarro".
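To make the idea of tokens more tangible, here is a small sketch that assumes the open-source tiktoken tokenizer (used by several OpenAI models) is installed. The exact way "ramarro" is split depends on the vocabulary, so the output is only illustrative; the point is that the model receives chunk IDs rather than letters, while counting letters is trivial when working directly on characters.

```python
# A sketch assuming the "tiktoken" library is available (pip install tiktoken).
# How "ramarro" gets split depends on the vocabulary; the split itself is not
# the point, only the fact that the model sees chunks, not letters.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

word = "ramarro"
token_ids = enc.encode(word)

# What the model actually "sees": integer IDs of multi-character chunks.
print(token_ids)
print([enc.decode([t]) for t in token_ids])

# Counting letters, by contrast, is easy when working on characters:
print(word.count("r"))                                  # 3
print([i + 1 for i, c in enumerate(word) if c == "r"])  # [1, 5, 6]
```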

When we activate the reasoning model, instead, a methodology called Chain of Thought is used which, before responding to a request, essentially breaks it into smaller sub-blocks before starting to think in tokens. This allows the model to catch parts of the text that, if the question were considered as a whole, would escape the software. If, for example, the request is to count the letters in a word, the model recognizes that it cannot be satisfied through the usual "reasoning" and therefore decides to directly run purpose-written code to perform the count.
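The snippet below is only a toy illustration of that last step, not the real Chain of Thought machinery: the letter-counting sub-task is handed off to deterministic code instead of being "reasoned" over tokens. All function names are invented for the example.

```python
# A toy illustration: the counting sub-task is delegated to exact string
# operations instead of token-level "reasoning".

def count_letter(word: str, letter: str) -> tuple[int, list[int]]:
    """Deterministic sub-step: count a letter and record its 1-based positions."""
    positions = [i + 1 for i, c in enumerate(word) if c.lower() == letter.lower()]
    return len(positions), positions

def answer_letter_question(word: str, letter: str) -> str:
    # Step 1: recognise that the request is about exact letter counting.
    # Step 2: run the deterministic sub-task instead of guessing token by token.
    count, positions = count_letter(word, letter)
    # Step 3: compose the final answer from the verified intermediate result.
    return f'The word "{word}" contains {count} "{letter}", at positions {positions}.'

print(answer_letter_question("ramarro", "r"))
# -> The word "ramarro" contains 3 "r", at positions [1, 5, 6]
```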

But is it possible to "trick" the reasoning models too? Yes. Let's see how through a riddle posed to DeepSeek R1.

Tricking the reasoning models: the riddle of the three switches

The riddle of the light bulb and the three switches reads like this:

Inside a room there is a light bulb that is switched off, while outside the room there are three different switches. The door of the room is closed and there is no way to see whether, by pressing one of the three switches, the light bulb turns on. What we know for sure is that one and only one of the three switches actually turns on the light bulb.

How do we find out which switch is the right one, if we can open the door only once and, once it is open, we can no longer touch the switches?

Asking this question to DeepSeek R1, we actually receive the correct answer (the reply is in English; if you are curious, you can find a detailed solution in our article on the riddle):

DeepSeek R1's response to the light bulb riddle

Well done, right? It managed to reason just as we would. What we did at this point, however, was to repeat the same request while simplifying the riddle: we stated that the room containing the light bulb is made of glass. A human being would promptly reply that, since the room is made of glass, it is enough to play with the switches and look through the wall to see which of the three actually lights the bulb. Glass is transparent!

The riddle of the three switches and a light bulb

DeepSeek R1, on the other hand, replied exactly as it had done in the case of the "classic" riddle. And this is precisely because it is not thinking, but "rummaging through the drawers" of what it has learned in search of the most plausible answer, thus landing on the classic solution to the three-switch riddle.

Solution

This shows us that the model is not actually thinking, but responding in a probabilistic way, without necessarily giving the right weight to all the words it encounters along the way.