AI Chatbot Outperforms Clinicians in Diagnosis Probability - EMJ

Artificial Intelligence Chatbot Outperforms Clinicians in Diagnosis Probability

ARTIFICIAL intelligence (AI) chatbots, specifically the large language model (LLM) ChatGPT-4 (OpenAI, San Francisco, California, USA), outperformed human clinicians in probabilistic reasoning when estimating the probability of a diagnosis following a negative test result, according to a recent study led by Adam Rodman, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA.

Probabilistic reasoning, the ability to make decisions based on calculating odds, is a challenging aspect of diagnosis. In the study, researchers provided ChatGPT-4 with the same five clinical cases used in a national practitioner survey (n=553), covering conditions such as pneumonia, breast cancer, asymptomatic bacteriuria, coronary artery disease, and urinary tract infection. The chatbot adjusted its estimates after receiving test results for each case.
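To illustrate the kind of update involved, the sketch below converts a pre-test probability into a post-test probability using the standard likelihood-ratio form of Bayes' theorem. It is not the method used in the study; the function name and the sensitivity, specificity, and pre-test values in the usage lines are hypothetical and chosen only for illustration.

```python
def post_test_probability(pre_test: float, sensitivity: float,
                          specificity: float, positive: bool) -> float:
    """Update a disease probability after a test result (Bayes' theorem, odds form).

    All inputs are probabilities strictly between 0 and 1.
    """
    for name, value in (("pre_test", pre_test),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    # Likelihood ratio: how much the result shifts the odds of disease.
    if positive:
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        likelihood_ratio = (1.0 - sensitivity) / specificity

    pre_odds = pre_test / (1.0 - pre_test)      # probability -> odds
    post_odds = pre_odds * likelihood_ratio     # apply the test result
    return post_odds / (1.0 + post_odds)        # odds -> probability


# Hypothetical test characteristics, for illustration only.
after_negative = post_test_probability(0.20, sensitivity=0.90,
                                       specificity=0.80, positive=False)
after_positive = post_test_probability(0.20, sensitivity=0.90,
                                       specificity=0.80, positive=True)
```

With a reasonably sensitive test, a negative result lowers the probability well below the pre-test value, which is the step the study suggests is often misjudged.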

The results indicated that the LLM exhibited lower error than clinicians in both pre-test and post-test probability estimates, particularly after negative test results, across all five cases. For example, in the case of asymptomatic bacteriuria, the LLM had a median pre-test probability of 26%, compared to 20% for clinicians, with a mean absolute error of 26.2 for the LLM, compared to 32.2 for clinicians.
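For readers unfamiliar with the metric, mean absolute error can be sketched as follows. This assumes each estimate is compared against a single reference probability, which may differ from how the study defined its reference values; the function name and the numbers in the usage line are hypothetical and are not study data.

```python
def mean_absolute_error(estimates: list[float], reference: float) -> float:
    """Average absolute distance between each estimate and a reference value."""
    if not estimates:
        raise ValueError("estimates must not be empty")
    return sum(abs(estimate - reference) for estimate in estimates) / len(estimates)


# Hypothetical probability estimates (in %) against a hypothetical reference of 20%.
example_error = mean_absolute_error([10.0, 30.0, 50.0], reference=20.0)
```

A lower value means the estimates sit closer to the reference on average, regardless of whether they are too high or too low.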

However, the LLM did not perform as well when faced with positive test results. It demonstrated greater accuracy than clinicians in two cases, similar accuracy in two cases, and less accuracy in one case.

Rodman highlighted that humans sometimes perceive a higher risk than actually exists after a negative test result, leading to unnecessary treatments, additional tests, and medications. While the LLM is not perfect, its ease of use and potential integration into clinical workflows could contribute to better decision-making.

The study emphasised the need for future research into the collective use of AI in healthcare. Despite limitations in the study design, such as a simple prompt strategy and inclusion of simplistic cases, the findings underscore the potential for AI to enhance diagnostic processes and decision-making in medical settings.
