Chatbot AI refers to a system based on artificial intelligence that interacts with users by simulating a human conversation. Behind a working AI chatbot there are many very complex things, but essentially there are two main elements: the artificial intelligence model and the physical components, the hardware. The model is a bit like the brain of the chatbot, while the hardware is a bit like the muscles: it provides the brain with the power needed to perform complex calculations. An AI chatbot, in particular, needs very powerful hardware: a supercomputer. We have an example right here in Italy, in the province of Bergamo, capable of supporting the development of new applications, including chatbots in Italian.
How is a model trained for an AI chatbot?
First of all, we need to understand that a chatbot must be trained before it can respond, and this happens through a process based on artificial intelligence, specifically the famous machine learning, through which the model learns from an enormous amount of data, texts and images that are fed to it. OK, but in concrete terms, how does the model learn?
There are three main learning stages.
The first phase is a bit like elementary school: the model acquires language skills and general notions. It is in this phase that it "learns the Italian language". How?
In practice, the model is given a series of quality-verified texts, such as a Geopop article, from which some words have been randomly hidden. What is required of the model is to fill in these empty spaces, a bit like a fill-in-the-blank puzzle. Initially it will make a lot of mistakes but, gradually, every time the chatbot gets something wrong, its parameters are adjusted. In this way, by learning to predict the missing word, the model also learns a whole series of related things: the syntactic structure of the sentence, the grammar and then a whole set of general notions.
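For the curious, here is a minimal sketch of this "fill the blank" idea, assuming the Hugging Face transformers library and a generic pretrained masked-language model (neither is named in the article). It shows only the prediction step; during training, it is the wrong predictions that drive the parameter adjustments described above.

```python
# Fill-in-the-blank sketch using an already trained masked-language model
# (illustrative choice of library and model, not the article's own setup).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-multilingual-cased")

# One word is hidden; the model proposes words to fill the empty space.
sentence = "The recipe for an apple [MASK] is easy to follow."
for prediction in fill_mask(sentence, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```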
The second phase, instead, is like university, because here the model learns more specialized skills: after learning Italian, it is given a series of questions with multiple answers and is "taught" to carry out tasks. The third and final phase is a bit like the final exam: the effectiveness of the answers is verified by a team of real people, who check the results of the assigned tasks. In this phase you can also set the type of behaviour expected, that is, the way and tone in which you expect the model to respond (for example formally, or playfully and youthfully, or with technical language, depending on the type of question). Once we are satisfied with the level of learning, the model is ready to be used and will no longer learn new information, at least until the next update.
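As a rough illustration of the material used in these later phases (the format below is invented for the example, not taken from any real chatbot), think of task examples paired with candidate answers and a human preference that encodes the expected tone:

```python
# Illustrative sketch, plain Python only: a task with candidate answers and
# the answer a human reviewer preferred, given that a formal tone was requested.
training_example = {
    "instruction": "Answer formally: what is a supercomputer?",
    "candidates": [
        "A supercomputer is a machine with very high computing power, "
        "used for large-scale calculations.",
        "It's just a really big PC.",
    ],
    "human_preference": 0,  # index of the answer the reviewers chose
}

preferred = training_example["candidates"][training_example["human_preference"]]
print("Reward answers like:", preferred)
```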
So, be careful: chatbots don't go looking for the answer on the internet, but generate an answer thanks to their own model, their own brain that has been trained. Ideally, if I had a lightweight generative chatbot, i.e. one that can only perform a few functions, I could also make it work on my PC without an internet connection. Technically it would work.
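As a hedged sketch of that idea, assuming the Hugging Face transformers library and a small model such as distilgpt2 (an illustrative choice, not something named in the article), a few lines are enough to generate text locally; once the model files are on disk, no internet connection is needed:

```python
# Running a small generative model locally (illustrative library and model).
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
print(generator("A chatbot is", max_new_tokens=20)[0]["generated_text"])
```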
Obviously the large international chatbots that know how to do everything are connected to the network, also so they can be constantly updated, and, be careful, they use very powerful engines and muscles: supercomputers.
Understanding questions and generating answers: behind the customer experience
Their computing power is used in particular to understand what we write to them and to formulate responses, which technically translates into data management and the necessary computing operations.
To understand our language, the bot divides the sentence into bricks, that is, into single units called tokens; in fact, we say that it tokenises the question it receives. To give an example, if the question were "how do you prepare an apple pie?", the bot would break the sentence down into 8 tokens (see the sketch after the list):
- how
- do
- you
- prepare
- an
- apple
- pie
- the question mark
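A minimal sketch of this idea in plain Python (a naive word-and-punctuation split, shown only to make the example concrete; real chatbots use more sophisticated subword tokenisers):

```python
# Naive tokenisation sketch: split the question into word and punctuation
# tokens. This is only an illustration of the idea, not the algorithm used
# by any specific model.
import re

question = "how do you prepare an apple pie?"
tokens = re.findall(r"\w+|[^\w\s]", question)
print(tokens)  # ['how', 'do', 'you', 'prepare', 'an', 'apple', 'pie', '?']
```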
This way it will be easier for it to analyze the sentence and recognize the most important parts of speech.
In fact, the next step is precisely to recognize “the most relevant bricks”, namely the keywords of the sentence.
In our case it identifies the words "pie" and "prepare", and at this point it understands that, most likely, to answer the question it will have to dig into the "drawers" of its memory linked to the world of cooking.
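To picture this keyword step, here is a toy sketch (the stopword list below is made up for the example; real systems rely on the model's learned representations rather than a hand-written list):

```python
# Toy keyword extraction: drop very common "function" words and keep the
# content words the model would treat as the most relevant bricks.
stopwords = {"how", "do", "you", "an", "a", "the", "?"}

tokens = ["how", "do", "you", "prepare", "an", "apple", "pie", "?"]
keywords = [token for token in tokens if token not in stopwords]
print(keywords)  # ['prepare', 'apple', 'pie']
```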
Once the question is understood, to answer it the chatbot moves on to the Natural Language Generation phase, i.e. the generation of the answer.
So, in this case, it selects the most important data relating to the apple pie recipe (among those it learned during the training phase), organizes the information in a logical structure, composes the sentences, connects them together and generates the text. And here is our answer.
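As a toy, template-based sketch of those steps (the facts and wording are invented for illustration, and real chatbots generate text word by word rather than from fixed templates):

```python
# Toy Natural Language Generation: select the relevant facts, organise them,
# then compose and connect the sentences into a single answer.
facts = {
    "dish": "apple pie",
    "ingredients": ["apples", "flour", "butter", "sugar"],
    "main_step": "bake the filled crust for about 45 minutes",
}

sentences = [
    f"To prepare an {facts['dish']}, you need {', '.join(facts['ingredients'])}.",
    f"Then {facts['main_step']}.",
]
answer = " ".join(sentences)
print(answer)
```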
The first chatbot made in Italy
Since July, Italy has had a supercomputer entirely dedicated to generative artificial intelligence. It was installed by Fastweb to create chatbots in Italian that can be used by companies and public administrations.
But what do we need a specifically Italian chatbot for? The advantages are many. First of all, being trained on Italian data, it will speak Italian very well, not only in terms of grammar, but also with all those facets of the language and cultural nuances that, normally, only those who live in the country can fully grasp. Furthermore, it will respect copyright on the content used for training, since the data is regularly acquired under agreements with publishers. Even companies that want to develop their own chatbot will be more protected, knowing that the servers are on Italian territory: that is not a trivial matter.
Secondly, since it is a model developed entirely from scratch, you have full control over all development phases, so, for example, you can choose the type of data to use for training. This model, specifically, will be trained on authoritative and verified data, thanks to agreements made with organizations of the caliber of Mondadori, Bignami and ISTAT.