An end-to-end heterogeneous graph attention network for Mycobacterium tuberculosis drug-resistance prediction

In this study, we represent genetic data from Mycobacterium tuberculosis as a graph and then adopt a deep graph learning method, a heterogeneous graph attention network ('HGAT-AMR'), to predict anti-tuberculosis (TB) drug resistance. The HGAT-AMR model can accommodate incomplete phenotypic profiles and provides 'attention scores' for genes and single nucleotide polymorphisms (SNPs), both at a population level and for individual samples. These scores indicate which inputs the model is 'paying attention to' when making its drug-resistance predictions. The results show that the proposed model achieved the best area under the receiver operating characteristic curve (AUROC) for isoniazid and rifampicin (98.53% and 99.10%, respectively), the best sensitivity for three first-line drugs (94.91% for isoniazid, 96.60% for ethambutol and 90.63% for pyrazinamide), and maintained performance when the data were associated with incomplete phenotypes (i.e. for those isolates for which phenotypic data for some drugs were missing). We also demonstrate that the model successfully identifies genes and SNPs associated with drug resistance, mitigating the impact of the resistance profile when considering resistance to a particular drug, which is consistent with domain knowledge.
PMID: 34414415 | DOI: 10.1093/bib/bbab299
Source: Briefings in Bioinformatics - Category: Bioinformatics - Source Type: research
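
The abstract reports per-drug sensitivity and AUROC, including for isolates whose phenotypic data are missing for some drugs. As a minimal sketch of how such metrics can be computed while skipping missing labels, the Python below encodes a missing phenotype as NaN and evaluates only the isolates with a known label for that drug. This is not the authors' evaluation code: the function name `masked_metrics`, the 0.5 decision threshold, the NaN encoding and the toy data are all assumptions made for illustration.

```python
import numpy as np


def masked_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity and AUROC for one drug, ignoring isolates with a missing label.

    y_true  : 1 = resistant, 0 = susceptible, NaN = phenotype not available
    y_score : predicted probability of resistance (hypothetical model output)
    """
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)

    # Keep only isolates whose phenotype for this drug is known.
    known = ~np.isnan(y_true)
    labels, scores = y_true[known], y_score[known]

    pos = scores[labels == 1]  # scores of resistant isolates
    neg = scores[labels == 0]  # scores of susceptible isolates
    if pos.size == 0 or neg.size == 0:
        # Both metrics are undefined without at least one isolate of each class.
        return float("nan"), float("nan")

    # Sensitivity: fraction of resistant isolates predicted resistant.
    sensitivity = float(np.mean(pos >= threshold))

    # AUROC: probability that a resistant isolate receives a higher score
    # than a susceptible one, with ties counted as one half.
    diff = pos[:, None] - neg[None, :]
    auroc = float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)
    return sensitivity, auroc


# Toy data (made up): the third isolate has no phenotype for this drug.
toy_labels = [1, 0, np.nan, 1, 0]
toy_scores = [0.9, 0.2, 0.7, 0.4, 0.6]
toy_sensitivity, toy_auroc = masked_metrics(toy_labels, toy_scores)
```

The pairwise AUROC formulation is chosen here for readability on small inputs; for large cohorts a rank-based implementation from an established library would be the more practical choice.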