Activity assessment of small drug molecules in estrogen receptor using multilevel prediction model

The authors propose a multilevel prediction model for assessing the activity of small drug molecules, with the aim of testing whether certain chemical compounds can disrupt processes in the human body and cause adverse health effects. The computational (in silico) method predicts the activity, activity score, potency, and efficacy of drug molecules at estrogen receptors (ERs) from their physicochemical properties (molecular descriptors). PaDEL-Descriptor is used for feature extraction. The ER dataset contains 8481 drug molecules, of which 1084 are active and 7397 are inactive, and each molecule is described by 1444 features. The dataset is therefore highly imbalanced and high-dimensional. The class imbalance is first addressed with the synthetic minority oversampling technique (SMOTE), and feature selection is performed with the FSelector library in R. A machine-learning-based multilevel prediction model is then developed in which classification is performed at the first level and regression at the second. The authors report that combining these strategies yields higher accuracy than many other computational approaches. K-fold cross-validation is used to measure the consistency of the model across all target classes. Finally, the method is validated on some drug molecules used in AIDS therapy.
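The pipeline described in the abstract (oversample the minority class, classify at the first level, regress at the second) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation: the abstract does not name the classifier or regressor, so a nearest-centroid classifier and a 1-nearest-neighbour regressor are used here as stand-ins, and the names `smote`, `TwoLevelModel`, and the toy two-feature data are all hypothetical. Feature selection and K-fold cross-validation are omitted for brevity.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling: each synthetic sample lies on the segment
    between a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x among the other minority samples
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment x -> nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


class TwoLevelModel:
    """Level 1: active/inactive classification (nearest centroid, a stand-in).
    Level 2: potency regression for predicted actives (1-NN, a stand-in)."""

    def fit(self, X_cls, y_cls, X_active, potency):
        pos = [x for x, y in zip(X_cls, y_cls) if y == 1]
        neg = [x for x, y in zip(X_cls, y_cls) if y == 0]
        self.c_pos, self.c_neg = centroid(pos), centroid(neg)
        # Level 2 is trained only on real (non-synthetic) active molecules,
        # because synthetic samples have no measured potency.
        self.actives = list(zip(X_active, potency))
        return self

    def predict(self, x):
        # Level 1: classify
        if dist(x, self.c_pos) >= dist(x, self.c_neg):
            return 0, None
        # Level 2: regress, only reached for predicted actives
        _, p = min(self.actives, key=lambda t: dist(x, t[0]))
        return 1, p


# Toy data (hypothetical): 6 inactive and 3 active molecules, 2 features each.
inactive = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
            [0.0, 0.3], [0.3, 0.0], [0.2, 0.2]]
active = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
active_potency = [5.0, 7.0, 6.0]

# Balance the classes for level 1 by adding synthetic actives.
synthetic = smote(active, n_new=len(inactive) - len(active))
X_cls = inactive + active + synthetic
y_cls = [0] * len(inactive) + [1] * (len(active) + len(synthetic))

model = TwoLevelModel().fit(X_cls, y_cls, active, active_potency)
print(model.predict([1.1, 1.0]))  # predicted active, with a potency estimate
print(model.predict([0.1, 0.1]))  # predicted inactive, no regression performed
```

The design point the sketch tries to capture is that the second level is only consulted for molecules the first level labels active, so regression targets such as potency are never predicted for compounds judged inactive.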
Source: IET Systems Biology - Category: Biology Source Type: research