OpenAI’s ChatGPT-4.0 Surpasses Human Performance in Neurology Exam

Dec 12, 2023

OpenAI’s language model ChatGPT-4.0 has passed a clinical neurology exam in a recent proof-of-concept study, answering 85% of the questions correctly. The result suggests that, with further refinement, large language models (LLMs) could find a useful role in clinical neurology.

The study was conducted by researchers from the University Hospital Heidelberg and the German Cancer Research Center in Heidelberg, and the results were published on December 7. The test given to ChatGPT-4.0 consisted of questions from the American Board of Psychiatry and Neurology, supplemented by a selection from the European Board for Neurology.

ChatGPT-4.0 outperformed its predecessor, ChatGPT-3.5, which answered 1,306 of 1,956 questions correctly, a score of 66.8%. The newer model answered 1,662 questions correctly, or 85%. By comparison, the average human score was 73.8%, so ChatGPT-4.0 exceeded the human average while ChatGPT-3.5 fell short of it. With 70% taken as the passing score, ChatGPT-4.0 passed the neurology exam and ChatGPT-3.5 did not.
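For readers who want to check the arithmetic, the short Python sketch below recomputes each model's score from the question counts reported above and compares it with the 70% passing score and the 73.8% human average. The variable and function names are illustrative only and do not come from the study.

```python
# Figures as reported in the article; names here are illustrative, not from the study.
TOTAL_QUESTIONS = 1956
CORRECT_ANSWERS = {"ChatGPT-3.5": 1306, "ChatGPT-4.0": 1662}
PASSING_SCORE = 70.0   # passing threshold cited in the article, in percent
HUMAN_AVERAGE = 73.8   # average human score cited in the article, in percent


def percent_correct(correct: int, total: int) -> float:
    """Return the share of correctly answered questions as a percentage."""
    return 100.0 * correct / total


# Score for each model, in percent.
scores = {
    model: percent_correct(correct, TOTAL_QUESTIONS)
    for model, correct in CORRECT_ANSWERS.items()
}

for model, score in scores.items():
    passed = score >= PASSING_SCORE
    above_human = score > HUMAN_AVERAGE
    print(f"{model}: {score:.1f}% (passed: {passed}, above human average: {above_human})")
```

Rounded to one decimal place, the counts give 66.8% for ChatGPT-3.5 and 85.0% for ChatGPT-4.0, matching the reported scores.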

However, the study also identified weaknesses. Both models performed worse on questions requiring “higher-order thinking” than on those needing only “lower-order thinking,” which indicates that LLMs, while promising for clinical neurology, still have room to improve.

Despite these limitations, the researchers involved in the study are optimistic about the potential applications of LLMs in clinical neurology. Dr. Varun Venkataramani, one of the authors of the study, explained to Cointelegraph, “We see our study more as a proof-of-concept for the capabilities of LLMs. There is still development needed and probably even specific fine-tuning of LLMs to make them properly applicable for clinical neurology.”

AI has already been applied to significant healthcare tasks, such as AstraZeneca’s cancer research and efforts to combat antibiotic overprescription in Hong Kong. ChatGPT-4.0’s passing score on a neurology exam adds to that record and marks another step toward AI-driven medical advances.