On the Identification of Clinically Relevant Bacterial Amino Acid Changes at the Whole Genome Level Using Auto-PSS-Genome

Abstract: The identification of clinically relevant bacterial amino acid changes can be performed using different methods aimed at identifying genes that show positively selected amino acid sites (PSS). Nevertheless, such analyses are time-consuming, and the frequency of genes showing evidence for PSS can be low. Therefore, a pipeline that allows the quick and efficient identification of the set of genes that show PSS is of interest. Here, we present Auto-PSS-Genome, a Compi-based pipeline distributed as a Docker image that automates the identification of genes showing PSS using three different methods, namely codeML, FUBAR, and omegaMap. Auto-PSS-Genome accepts as input a set of FASTA files, one per genome, each containing all coding sequences, thus minimizing the work needed to conduct analyses of positively selected sites. The pipeline identifies orthologous gene sets and corrects for multiple possible problems in the input FASTA files that may prevent the automated identification of genes showing PSS. A FASTA file containing all coding sequences can also be given as an external global reference, which eases the comparison of results across species when gene names differ. In this work, we use Auto-PSS-Genome to analyse Mycobacterium leprae (which causes leprosy) and the closely related species M. haemophilum, which mainly causes ulcerating skin infections and arthritis in persons who are severely immunocompromised, and in children ...
Source: Interdisciplinary Sciences, Computational Life Sciences - Category: Bioinformatics - Source Type: research