Experts have warned against relying on AI-powered chatbots for health and medical information, stressing that they may provide inaccurate or misleading answers.
A recent study found that these systems, including ChatGPT and Grok, suffer from what is known as "hallucination," a phenomenon in which they produce false or incomplete information while presenting it in a confident, seemingly trustworthy manner.
In a study involving 50 medical questions, nearly half of the answers provided by chatbots were found to be "problematic." The results showed that Grok had the highest error rate (58%), followed by ChatGPT (52%), and then Meta AI (50%).
The researchers attributed these errors to the models' reliance on training data that may be biased or incomplete, as well as to their occasional tendency toward sycophancy, that is, providing answers that conform to the user's beliefs rather than to scientific accuracy.
They also noted that these systems are not licensed to provide medical advice and do not always have access to the latest information, which makes their use in this field risky without expert supervision.
The study posed a set of common questions to several chatbots, covering topics such as the effectiveness of vitamin D supplements, the safety of COVID-19 vaccines, the risks of childhood vaccination, as well as questions about cancer, stem cells, and diets.
The results showed that these systems performed relatively better on topics related to vaccines and cancer, while their accuracy declined in areas such as nutrition, athletic performance, and stem cell-based therapies.
The researchers emphasized that chatbots do not actually analyze scientific evidence but instead rely on statistical prediction to generate text, which means their answers may appear accurate yet lack reliability.
Previous research has also found that a large proportion of the references cited by these systems may be inaccurate or even fabricated; in one study, no more than 32% of the cited references were correct.
The researchers stressed the need to enhance public awareness, develop regulatory controls, and provide appropriate professional training to ensure that artificial intelligence is used in a way that supports public health rather than harms it.
The results were published in the journal BMJ Open.
