Evaluation of ChatGPT and Gemini large language models for pharmacometrics with NONMEM

AbstractTo assess ChatGPT 4.0 (ChatGPT) and Gemini Ultra 1.0 (Gemini) large language models on NONMEM coding tasks relevant to pharmacometrics and clinical pharmacology. ChatGPT and Gemini were assessed on tasks mimicking real-world applications of NONMEM. The tasks ranged from providing a curriculum for learning NONMEM, an overview of NONMEM code structure to generating code. Prompts in lay language to elicit NONMEM code for a linear pharmacokinetic (PK) model with oral administration and a more complex model with two parallel first-order absorption mechanisms were investigated. Reproducibility and the impact of “temperature” hyperparameter settings were assessed. The code was reviewed by two NONMEM experts. ChatGPT and Gemini provided NONMEM curriculum structures combining foundational knowledge with advanced concepts (e.g., covariate modeling and Bayesian approaches) and practical skills including NON MEM code structure and syntax. ChatGPT provided an informative summary of the NONMEM control stream structure and outlined the key NONMEM Translator (NM-TRAN) records needed. ChatGPT and Gemini were able to generate code blocks for the NONMEM control stream from the lay language prompts for the two coding tasks. The control streams contained focal structural and syntax errors that required revision before they could be executed without errors and warnings. The code output from ChatGPT and Gemini was not reproducible, and varying the temperature hyperparameter did not redu...
Source: Journal of Pharmacokinetics and Pharmacodynamics - Category: Drugs & Pharmacology Source Type: research