ChatGPT shows 'inappropriate recommendation' for cancer treatment
"We need to raise awareness that LLMs are not the equivalent of trained medical professionals," said author Shan Chen.
SAN FRANCISCO: OpenAI's AI chatbot ChatGPT 3.5 provided inappropriate ("non-concordant") recommendations for cancer treatment, highlighting the need for awareness of the technology's limitations, a new study has shown.
The researchers prompted the AI chatbot to provide treatment advice that aligned with guidelines established by the National Comprehensive Cancer Network (NCCN), according to the study published in the journal JAMA Oncology.
"ChatGPT responses can sound a lot like a human and can be quite convincing. But, when it comes to clinical decision-making, there are so many subtleties for every patient's unique situation. A right answer can be very nuanced, and not necessarily something ChatGPT or another large language model can provide," said corresponding author Danielle Bitterman, MD, of the Department of Radiation Oncology at the US-based Mass General Brigham.
The researchers focused on the three most common cancers (breast, prostate and lung cancer) and prompted ChatGPT to provide a treatment approach for each cancer based on the severity of the disease.
In total, they included 26 unique diagnosis descriptions and used four slightly different prompts.
According to the study, nearly all responses (98 per cent) included at least one treatment approach that agreed with NCCN guidelines. However, the researchers found that 34 per cent of these responses also included one or more non-concordant recommendations, which were sometimes difficult to detect amidst otherwise sound guidance.
In 12.5 per cent of cases, ChatGPT produced "hallucinations," or treatment recommendations entirely absent from NCCN guidelines, including recommendations of novel therapies or of curative therapies for non-curative cancers.
The researchers stated that this form of misinformation can incorrectly set patients’ expectations about treatment and potentially impact the clinician-patient relationship.
"Users are likely to seek answers from the LLMs to educate themselves on health-related topics -- similarly to how Google searches have been used. At the same time, we need to raise awareness that LLMs are not the equivalent of trained medical professionals," said first author Shan Chen, MS, of the Artificial Intelligence in Medicine (AIM) Programme.