
Can we trust online reviews? What the numbers don’t say

Does it make sense to choose a hotel based only on the average of its online reviews? Or to buy a cosmetic product just because “90% saw improvements”? Every day we make decisions based on numbers and statistics: but what happens when data is missing? Whether it is information never collected, neglected or deliberately omitted, even what we are not told counts. And the story it tells can prove even more interesting than the one told by the data we do have.

Online reviews: the importance of missing data

When we have to choose a restaurant or a hotel, more and more often we check the online reviews and compare the average ratings. But how much can we trust this information? Several platforms only allow those who have actually used the service to leave a review, which makes the average score more reliable. But is that enough to make it trustworthy?

Studies tell us that only a minority of people review their experiences online. Understanding why some reviews are missing therefore becomes essential to evaluate the reliability of the average score.


The reasons why reviews can be missing are manifold, and statistics classifies them into three categories:

  • Reviews that are missing, for example, because of a server loading error are data missing purely by chance (MCAR: Missing Completely At Random). In these cases, the available reviews can still be considered representative.
  • If instead the reviews are missing because some groups – for example, the elderly – are less inclined to write online reviews, then we are dealing with data absent for reasons related to other observable variables (MAR: Missing At Random). This can distort the average score, but on some platforms we can partially correct for it by filtering the reviews according to the profile of the user who wrote them, for example by age group.
  • The most critical case is when people decide not to leave a review precisely because of the kind of experience they had. Many people, for example, decide not to write a review after a mediocre or negative experience. In these situations the average can be misleading, because the absence of the rating is linked precisely to the rating itself (MNAR: Missing Not At Random), as the sketch after this list illustrates.
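
To make the difference between these mechanisms concrete, here is a minimal sketch in Python, with invented ratings and probabilities: the same set of “true” experiences is first hidden completely at random (MCAR) and then hidden depending on the rating itself (MNAR). Only in the second case does the visible average drift noticeably away from the real one.

```python
import random

random.seed(42)

# Hypothetical "true" ratings of 1,000 guests: mostly middling experiences.
true_ratings = [random.choices([1, 2, 3, 4, 5], weights=[5, 15, 40, 25, 15])[0]
                for _ in range(1000)]

def mean(xs):
    return sum(xs) / len(xs)

# MCAR: every guest reviews with the same probability (here an arbitrary 30%),
# so the visible average stays close to the true one.
mcar = [r for r in true_ratings if random.random() < 0.30]

# MNAR: the probability of reviewing depends on the rating itself --
# here, guests with extreme experiences (1 or 5 stars) review far more often.
def review_prob(rating):
    return 0.60 if rating in (1, 5) else 0.10

mnar = [r for r in true_ratings if random.random() < review_prob(r)]

print(f"True average:         {mean(true_ratings):.2f}")
print(f"Visible average MCAR: {mean(mcar):.2f}  ({len(mcar)} reviews)")
print(f"Visible average MNAR: {mean(mnar):.2f}  ({len(mnar)} reviews)")
```

With these made-up numbers the polarized reviewers push the visible MNAR average away from the true one; the direction and size of the distortion depend entirely on who stays silent.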

When few speak for everyone: the risk of generalizing too much

Studies tell us that those who write reviews do so, more frequently, after very positive or very negative experiences. The “normal” experiences disappear into silence, generating a phenomenon known as selection bias. The result?

Especially when reviews are few and extreme, the average risks reflecting a compromise between the enthusiasm of some and the frustration of others rather than being a realistic indicator of the experiences actually lived. That is why, especially when there are only a few reviews, it is better to be cautious. Caution is also a must when faced with suspicious patterns, such as many reviews all written on the same day or lacking detailed comments: think of a trattoria with 4.8 stars from 15 reviews, all posted on the same day and all very generic.
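
As a back-of-the-envelope illustration, with numbers invented for the purpose: a handful of polarized reviews can pull the visible average well away from what the typical, silent customer actually experienced.

```python
# Toy example: most customers had an unremarkable experience,
# but only the enthusiasts and the disappointed bothered to write a review.
published = [5] * 10 + [1] * 5   # 10 five-star raves, 5 one-star rants
silent    = [3] * 85             # 85 "normal" customers who wrote nothing

visible_average = sum(published) / len(published)
real_average    = sum(published + silent) / (len(published) + len(silent))

print(f"Average of the published reviews:      {visible_average:.2f}")  # ~3.67
print(f"Average including the silent majority: {real_average:.2f}")     # 3.10
```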

The risk of generalizing on the basis of incomplete data does not concern online reviews alone. In clinical and cosmetic research too, missing data can generate significant distortions. How many times do we see the results of apparently very promising clinical studies advertised, or read phrases like “Cream tested on 100 people: 90% saw improvements!”? But how many participants did the study start with? If 90 out of the 100 who finished noticed improvements, but another 30 people left the trial before its end, the results could be much less exciting.
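
A quick worked example, assuming the 100 people in the slogan are those who completed the study and the 30 dropouts come on top of them, shows how wide the real range becomes once the missing participants are counted:

```python
# Hypothetical reading of the slogan "90% saw improvements".
completers = 100   # people who finished the study
improved   = 90    # of whom reported an improvement
dropouts   = 30    # people who left the study early (outcome unknown)

enrolled = completers + dropouts

best_case  = (improved + dropouts) / enrolled  # every dropout improved anyway
worst_case = improved / enrolled               # no dropout improved

print(f"Advertised rate: {improved / completers:.0%}")                    # 90%
print(f"Range once dropouts are counted: {worst_case:.0%} - {best_case:.0%}")
```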

And this is far from rare: statistics show that the drop-out rate (i.e. early exit from the study) can exceed 20%. Those missing data could hide not only an absence of improvement but even unreported side effects, and the final results will appear more positive than they should.

Beyond the data: the value of absence

Not all missing data are a problem. Sometimes, their absence is intentional and useful.

For example, when the review does not concern a hotel but delicate topics, such as the evaluation of one’s workplace or university, the anonymity of the respondents can increase the sincerity of the answers. In these situations, missing data, such as the respondents’ first and last names, can positively influence the willingness to share sensitive information, improving the overall quality of the results.

The same applies to scientific research. The best clinical trials are “blind” ones, in which the patient does not know which treatment is being administered. Sometimes not even the researchers who follow the study know (double blind), or even those who analyze the data (triple blind). Keeping that key piece of information about the treatment hidden helps to isolate its real effectiveness from suggestion, expectations and prejudice.

Every missing piece of data, every review that is not there, can help us better evaluate the trustworthiness of the information we receive. Even without statistical tools, we can always ask ourselves: “Who left no information? And why?”