G4Bank: A database of experimentally identified DNA G-quadruplex sequences

Abstract: The G-quadruplex (G4), a non-canonical nucleic acid structure, has been suggested to play a key role in important cellular processes, including transcription, replication and cancer development. Recently, high-throughput sequencing approaches for G4 detection have produced a large amount of experimentally identified G4 data, which reveal genome-wide G4 landscapes and enable the development of new methods for predicting potential G4s from sequence. Although several existing databases provide G4 experimental data and relevant biological information from different perspectives, no dedicated database collects and analyzes DNA G4 experimental data on a genome-wide scale. Here, we constructed G4Bank, a database of experimentally identified DNA G-quadruplex sequences. A total of 6,915,983 DNA G4s were collected from 13 organisms, and state-of-the-art prediction methods were applied to filter and analyze the G4 data. G4Bank will help users access comprehensive G4 experimental data and will support sequence feature analysis of G4s for further investigation. The database can be accessed at http://tubic.tju.edu.cn/g4bank/.
Source: Interdisciplinary Sciences: Computational Life Sciences - Category: Bioinformatics - Source Type: research