Pre-training molecular representation model with spatial geometry for property prediction

Comput Biol Chem. 2024 Feb 7;109:108023. doi: 10.1016/j.compbiolchem.2024.108023. Online ahead of print.

ABSTRACT

AI-enhanced bioinformatics and cheminformatics pivot on generating increasingly descriptive and generalized molecular representations. Accurate prediction of molecular properties requires a comprehensive description of molecular geometry. We design a novel Graph Isomorphism Network (GIN) based model that integrates a three-level network structure with a dual-level pre-training approach that aligns with the characteristics of molecules. In our Spatial Molecular Pre-training (SMPT) model, the network learns implicit geometric information layer by layer, from lower to higher dimensions. Extensive evaluations against established baseline models validate the enhanced efficacy of SMPT, with particularly strong results on classification tasks. These results emphasize the importance of spatial geometric information in molecular representation modeling and demonstrate the potential of SMPT as a valuable tool for property prediction.

PMID: 38335852 | DOI: 10.1016/j.compbiolchem.2024.108023
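
For readers unfamiliar with the GIN building block named in the abstract, the following is a minimal sketch of the standard GIN node update, h_v' = MLP((1 + eps) * h_v + sum of neighbor features), followed by a sum readout over the graph. It is not the SMPT architecture: the abstract does not specify the three-level structure or the pre-training objectives, so none of that is modeled here. The function names (`gin_layer`, `sum_readout`), the toy graph, and the identity function standing in for the MLP are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Vector = List[float]


def gin_layer(
    features: Dict[int, Vector],
    neighbors: Dict[int, List[int]],
    mlp: Callable[[Vector], Vector],
    eps: float = 0.0,
) -> Dict[int, Vector]:
    """One standard GIN update: h_v' = mlp((1 + eps) * h_v + sum_u h_u)."""
    updated: Dict[int, Vector] = {}
    for node, h in features.items():
        # Start from the node's own features, scaled by (1 + eps).
        agg = [(1.0 + eps) * x for x in h]
        # Add the (unnormalized) sum of neighbor features.
        for nb in neighbors.get(node, []):
            for i, x in enumerate(features[nb]):
                agg[i] += x
        updated[node] = mlp(agg)
    return updated


def sum_readout(features: Dict[int, Vector]) -> Vector:
    """Graph-level embedding: element-wise sum over all node features."""
    dim = len(next(iter(features.values())))
    return [sum(h[i] for h in features.values()) for i in range(dim)]


# Hypothetical toy graph: a three-atom chain 0 - 1 - 2 with one-hot
# atom-type features (illustrative values, not data from the paper).
toy_features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 0.0]}
toy_neighbors = {0: [1], 1: [0, 2], 2: [1]}


def identity(v: Vector) -> Vector:
    """Stand-in for a learned MLP, used only to keep the sketch small."""
    return list(v)


hidden = gin_layer(toy_features, toy_neighbors, identity)
graph_embedding = sum_readout(hidden)
```

In a trained model the identity stand-in would be replaced by a learned multilayer perceptron, and several such layers would be stacked before the readout. How SMPT injects spatial geometry into these layers is described in the paper itself, not in this abstract.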
Source: Computational Biology and Chemistry - Category: Bioinformatics - Source Type: research