Molecules, Vol. 24, Pages 2414: A Ligand-Based Virtual Screening Method Using Direct Quantification of Generalization Ability

Molecules, doi: 10.3390/molecules24132414
Authors: Weixing Dai, Dianjing Guo

Machine learning plays an important role in ligand-based virtual screening. However, conventional machine learning approaches tend to be inefficient on such problems, where the data are imbalanced and the features describing the chemical characteristics of ligands are high-dimensional. Here we describe LBS (local beta screening), a machine learning algorithm for ligand-based virtual screening. The distinguishing characteristic of LBS is that it quantifies the generalization ability of screening directly through a refined loss function. It can therefore assess the risk of over-fitting accurately and efficiently for imbalanced, high-dimensional data without resorting to resampling methods such as cross-validation. The robustness of LBS was demonstrated by a simulation study and by tests on real datasets, in which LBS outperformed conventional algorithms in terms of screening accuracy and model interpretation. LBS was then used to screen an independent compound library for potential activators of HIV-1 integrase multimerization, and the virtual screening result was experimentally validated. Of the 25 compounds tested, six proved to be active. The most potent compound in experimental validation showed an EC50 value of 0.71 µM.
Source: Molecules - Category: Chemistry - Tags: Article - Source Type: research