Predicting antimicrobial resistance using conserved genes

In this study, we explore the possibility of predicting AMR phenotypes using incomplete genome sequence data. Models were built from small sets of randomly selected core genes after removing the AMR genes. For Klebsiella pneumoniae, Mycobacterium tuberculosis, Salmonella enterica, and Staphylococcus aureus, we report that it is possible to classify susceptible and resistant phenotypes with average F1 scores ranging from 0.80–0.89 with as few as 100 conserved non-AMR genes, with very major error rates ranging from 0.11–0.23 and major error rates ranging from 0.10–0.20. Models built from core genes have predictive power in cases where the primary AMR mechanisms result from SNPs or horizontal gene transfer. By randomly sampling non-overlapping sets of core genes, we show that F1 scores and error rates are stable and have little variance between replicates. Although these small core gene models have lower accuracies and higher error rates than models built from the corresponding assembled genomes, the results suggest that sufficient variation exists in the core non-AMR genes of a species for predicting AMR phenotypes.
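The sampling and scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the gene identifiers, and the "R"/"S" labels are hypothetical. It assumes the conventional definitions of the error rates (very major error: a resistant isolate predicted susceptible, as a fraction of resistant isolates; major error: a susceptible isolate predicted resistant, as a fraction of susceptible isolates) and computes F1 for the resistant class only, which may differ from the averaging used in the study.

```python
import random


def sample_disjoint_gene_sets(core_genes, set_size, n_sets, seed=0):
    """Draw n_sets non-overlapping random gene sets of set_size genes each.

    Hypothetical helper: shuffles the gene list once and slices it, so no
    gene can appear in more than one replicate set.
    """
    if set_size * n_sets > len(core_genes):
        raise ValueError("not enough core genes for disjoint sets")
    rng = random.Random(seed)
    shuffled = list(core_genes)
    rng.shuffle(shuffled)
    return [shuffled[i * set_size:(i + 1) * set_size] for i in range(n_sets)]


def amr_metrics(y_true, y_pred, resistant="R"):
    """Return F1 (resistant class), very major and major error rates.

    Assumes binary labels; anything that is not `resistant` is treated
    as susceptible.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == resistant:
            if pred == resistant:
                tp += 1
            else:
                fn += 1  # resistant called susceptible: very major error
        else:
            if pred == resistant:
                fp += 1  # susceptible called resistant: major error
            else:
                tn += 1
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    vme_rate = fn / (tp + fn) if (tp + fn) else 0.0
    me_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {"f1": f1, "very_major_error": vme_rate, "major_error": me_rate}


# Ten disjoint replicate sets of 100 genes from 1,000 placeholder gene IDs.
gene_sets = sample_disjoint_gene_sets(
    [f"gene_{i}" for i in range(1000)], set_size=100, n_sets=10
)

# Toy phenotypes, for illustration only (not data from the study).
scores = amr_metrics(
    ["R", "R", "R", "R", "S", "S", "S", "S"],
    ["R", "R", "R", "S", "S", "S", "S", "R"],
)
```

In a full analysis, a separate classifier would be trained on each gene set and the spread of these metrics across replicates would indicate how stable the predictions are.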
Source: PLoS Computational Biology - Category: Biology - Source Type: research