ConCreT, a 2D convolutional neural network for taxonomic classification applied to viruses in the phylum Cressdnaviricota

J Virol Methods. 2023 Aug 1:114789. doi: 10.1016/j.jviromet.2023.114789. Online ahead of print.

ABSTRACT

Taxonomic assignments allow scientists to communicate better with each other. In virology, taxonomy is continually improving towards a more precise and comprehensive framework. With the huge numbers of new viruses being described in metagenomic studies, automated taxonomy tools are urgently needed. A number of such tools have been proposed, and those applying machine learning (ML), mainly deep learning, stand out for their accurate results. Still, there is a demand for tools that are less computationally intensive and that can classify viruses down to the ranks of genus and species. Cressdnaviruses are good subjects for testing such tools, due to their small, circular genomes and the existence of several families and genera with a highly imbalanced number of species. We developed a 2D convolutional neural network for virus taxonomy and tested it for classification of viruses from the phylum Cressdnaviricota. We obtained >98% accuracy in the final pipeline tested, which we named ConCreT (Convolutional Neural Network for Cressdnavirus Taxonomy). Mixing augmentation for the more imbalanced groups with no augmentation for the more balanced ones achieved the best score in the final test.

PMID: 37536450 | DOI: 10.1016/j.jviromet.2023.114789
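The abstract does not say which augmentation was used, so the following is only a minimal sketch of the general idea of augmenting under-represented groups while leaving well-represented ones untouched. It assumes, purely for illustration, that a circular genome can be augmented by rewriting it from a different start position. The function names, the `min_count` threshold, and the toy sequences and labels are all hypothetical and are not taken from ConCreT.

```python
import random
from collections import Counter


def rotate_circular(seq: str, offset: int) -> str:
    """Write the same circular sequence starting from another position.

    Assumes a non-empty sequence.
    """
    offset %= len(seq)
    return seq[offset:] + seq[:offset]


def augment_minority_classes(records, min_count, seed=0):
    """Pad classes that have fewer than `min_count` examples.

    `records` is a list of (sequence, label) pairs. Classes that already
    have at least `min_count` examples are returned unchanged; smaller
    classes receive rotated copies of their own sequences until they
    reach `min_count`. Rotation is an assumed augmentation, not one
    confirmed by the abstract.
    """
    rng = random.Random(seed)

    # Group sequences by label so each class can be sized independently.
    by_label = {}
    for seq, label in records:
        by_label.setdefault(label, []).append(seq)

    augmented = list(records)
    for label, seqs in by_label.items():
        missing = max(0, min_count - len(seqs))
        for _ in range(missing):
            seq = rng.choice(seqs)
            offset = rng.randrange(len(seq))
            augmented.append((rotate_circular(seq, offset), label))
    return augmented


# Toy data: one sparse class and one class that already meets the threshold.
records = [
    ("ATGCCGTA", "genus_A"),
    ("TTGACCGA", "genus_B"),
    ("TTGACCGT", "genus_B"),
    ("TTGACCGG", "genus_B"),
]
balanced = augment_minority_classes(records, min_count=3)
counts = Counter(label for _, label in balanced)
```

In this sketch only `genus_A` gains examples, because `genus_B` already meets the threshold. A real pipeline would choose the threshold and the augmentation from validation results rather than fixing them in advance.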
Source: Journal of Virological Methods - Category: Virology - Source Type: research