
According to a study published on February 9, 2023, in the open-access journal PLOS Digital Health by Tiffany Kung, Victor Tseng, and colleagues at AnsibleHealth, ChatGPT can score at or around the roughly 60% passing threshold for the United States Medical Licensing Exam (USMLE). The study further reports that ChatGPT’s responses are logically coherent, internally consistent, and frequently insightful.

ChatGPT produces human-like text

ChatGPT is a novel AI system that belongs to the category of large language models (LLMs). It generates human-like text by predicting the words most likely to come next. Unlike most other chatbots, ChatGPT does not search the internet to produce its answers. Instead, it generates text from the relationships between words that its internal processes predict.
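
To make the idea of next-word prediction concrete, here is a deliberately tiny sketch. It is not how ChatGPT works internally: ChatGPT uses a large neural network, whereas this toy simply counts which word most often follows another in a short, made-up training sentence. The function names `train_bigram` and `predict_next` and the sample corpus are illustrative assumptions, not anything taken from the study.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up training text, for illustration only.
corpus = (
    "the patient has a fever and the patient has a cough "
    "and the doctor orders a test"
)
model = train_bigram(corpus)

print(predict_next(model, "patient"))  # has
print(predict_next(model, "the"))      # patient
```

Like ChatGPT, the toy relies only on patterns learned from its training text and never looks anything up. The difference is scale and method: a large language model takes far more context into account than the single preceding word used here.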

Researchers led by Kung evaluated ChatGPT’s performance on the USMLE, a set of three standardized and regulated exams (Steps 1, 2CK, and 3) required for medical licensure in the United States. These exams assess the knowledge of medical students and physicians-in-training across a range of medical disciplines, including biochemistry, diagnostic reasoning, and bioethics.

To conduct the study, the researchers removed image-based questions and tested ChatGPT on 350 of the 376 public questions made available in the June 2022 USMLE release.
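
The screening step can be sketched in a few lines. This is only an illustration under stated assumptions: the `Question` record and its `has_image` field are hypothetical, and the sketch assumes the 26 excluded questions (376 minus 350) are exactly the image-based ones. Which specific questions carried images is not given here, so the flags below are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Question:
    qid: int
    has_image: bool  # hypothetical flag; not a field from the study's data


TOTAL_PUBLIC = 376  # public questions in the June 2022 release (from the article)
IMAGE_BASED = 26    # assumed: 376 - 350, the questions that were excluded

# Placeholder data: mark the first 26 questions as image-based for illustration.
questions = [
    Question(qid=i, has_image=(i < IMAGE_BASED)) for i in range(TOTAL_PUBLIC)
]

# Keep only the questions a text-only model can be asked.
text_only = [q for q in questions if not q.has_image]

retained = len(text_only)
share = retained / len(questions)
print(retained, f"{share:.1%}")  # 350 93.1%
```

In other words, about 93% of the public question set remained after screening.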

ChatGPT outperformed PubMedGPT

ChatGPT scored between 52.4% and 75.0% across the three USMLE exams, a range that sits at or around the passing threshold of approximately 60% each year. It also showed 94.6% concordance across all of its answers, and 88.9% of its responses contained at least one significant insight that was new, non-obvious, and clinically valid. Notably, ChatGPT outperformed PubMedGPT, a model trained solely on biomedical domain literature, which scored 50.8% on an older set of USMLE-style questions.

Although the limited input size constrained the depth and range of the analyses, the authors note that their findings offer a glimpse of ChatGPT’s potential to improve medical education and, ultimately, clinical practice.