Comprehensive analysis of the performance of GPT-3.5 and GPT-4 on the American Urological Association self-assessment study program exams from 2012-2023

CONCLUSIONS: GPT-4 scored significantly higher than GPT-3.5 on the AUA SASP exams in overall performance, across all test years, and in various urology topic areas. This suggests improvement in evolving AI language models in answering clinical urology questions; however, certain aspects of medical knowledge and clinical reasoning remain challenging for AI language models.

PMID:38381942 | DOI:10.5489/cuaj.8526

https://pubmed.ncbi.nlm.nih.gov/38381942/?utm_source=no_user_agent&utm_medium=rs...

Source: Canadian Urological Association Journal - February 21, 2024 | Category: Urology & Nephrology | Authors: Ali Sherazi, David Canes | Source Type: research