Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study

Ventricular septal defect (VSD) is a fatal congenital heart disease with severe consequences for affected infants, for whom early diagnosis, particularly through genetic variants, plays an important role. Existing panel-based approaches to variant mining suffer from a shortage of large panels, high sequencing costs, and missed rare variants. Although trio-based methods alleviate these limitations to some extent, they are agnostic to novel mutations and computationally intensive. Considering these limitations, we study a novel algorithm for mining variants from trio-based sequencing data and apply it to a VSD trio to identify associated mutations. Our approach first filters irrelevant k-mers from the sequences of a trio using a newly conceived coupled-Bloom Filter, then corrects sequencing errors with a statistical approach and extends the retained k-mers into long sequences. These extended sequences are used as input for variant calling. The resulting variants are then comprehensively analyzed against existing databases to mine VSD-related mutations. Experiments show that our trio-based algorithm narrows down candidate coding genes and lncRNAs by about 10-fold and 5-fold, respectively, compared with single-sequence-based approaches. Meanwhile, our algorithm is 10 times faster and uses two orders of magnitude less memory than the existing state-of-the-art approach. By applying our approach to a VSD trio, we identify an unreported gene, CD80, a combination of two genes, MYBPC3 and TRDN, and...
Source: Frontiers in Genetics - Category: Genetics & Stem Cells Source Type: research
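
The k-mer filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the coupled-Bloom Filter is built or what makes a k-mer "irrelevant", so everything below is an assumption made for illustration: two ordinary, independent Bloom filters (one per parent) stand in for the coupled design, and a child k-mer is treated as irrelevant if either parental filter reports it. The names `BloomFilter`, `kmers`, and `child_specific_kmers` are hypothetical, and reverse complements, error correction, and k-mer extension are left out.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over strings (illustrative; not the paper's coupled design)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def child_specific_kmers(child_reads, father_reads, mother_reads, k=5):
    """Keep child k-mers that neither parental filter reports (assumed criterion)."""
    father, mother = BloomFilter(), BloomFilter()
    for reads, bloom in ((father_reads, father), (mother_reads, mother)):
        for read in reads:
            for km in kmers(read, k):
                bloom.add(km)

    kept = set()
    for read in child_reads:
        for km in kmers(read, k):
            if km not in father and km not in mother:
                kept.add(km)
    return kept


if __name__ == "__main__":
    # Toy reads: the child differs from both parents at two adjacent bases.
    father_reads = ["ACGTACGTAAGACC"]
    mother_reads = ["ACGTACGTAAGACC"]
    child_reads = ["ACGTACGTTTGACC"]
    print(sorted(child_specific_kmers(child_reads, father_reads, mother_reads)))
```

Because a Bloom filter never yields false negatives, every k-mer this sketch keeps is truly absent from both parents. False positives are possible, so a genuinely child-specific k-mer can occasionally be discarded; the filter size and hash count trade memory against that loss.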