In Silico Prediction of Inhibitory Constant of Thrombin Inhibitors Using Machine Learning.

This study set out to predict the inhibitory constant (Ki) of thrombin inhibitors from a large data set using machine learning. Because machine learning can uncover non-intuitive regularities in high-dimensional data, it is well suited to building effective predictive models. A total of 6554 descriptors were collected for each compound, and an efficient descriptor selection method was used to identify the appropriate ones. Four methods, namely multiple linear regression (MLR), K-nearest neighbors (KNN), gradient boosting regression tree (GBRT) and support vector machine (SVM), were used to build prediction models from the selected descriptors. RESULTS: The SVM model performed best, with R² = 0.84 and MSE = 0.55 on the training set and R² = 0.83 and MSE = 0.56 on the test set. Several validation methods, including a y-randomization test and applicability domain evaluation, were used to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be used for rapid estimation of the inhibitory constant, which should help in designing novel thrombin inhibitors.
PMID: 30569853 [PubMed - as supplied by publisher]
Source: Combinatorial Chemistry and High Throughput Screening - Category: Chemistry - Tags: Comb Chem High Throughput Screen - Source Type: research
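
The abstract reports R² and MSE on a training/test split and mentions a y-randomization test. The sketch below illustrates how those pieces fit together. It is not the authors' pipeline: the data are synthetic (two made-up descriptors and a made-up linear response standing in for a pKi-like value), the model is a small hand-written KNN regressor rather than the SVM that performed best in the study, and all function names (`knn_predict`, `evaluate`, `y_randomization`, `make_synthetic`) are hypothetical. The idea behind y-randomization is that a model fitted to shuffled responses should score far worse than the model fitted to the real ones; if it does not, the original fit may reflect chance correlation.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(X_train, y_train, x, k=3):
    """Predict one value as the mean response of the k nearest training points."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(X_train[i], x))

    nearest = sorted(range(len(X_train)), key=sq_dist)[:k]
    return sum(y_train[i] for i in nearest) / k


def evaluate(X_train, y_train, X_test, y_test, k=3):
    """Return (R2, MSE) of the KNN model on the held-out test set."""
    preds = [knn_predict(X_train, y_train, x, k) for x in X_test]
    return r2(y_test, preds), mse(y_test, preds)


def y_randomization(X_train, y_train, X_test, y_test, n_rounds=20, k=3, seed=0):
    """Refit on shuffled training responses and collect the test-set R2 of each round."""
    rng = random.Random(seed)
    scores = []
    for _round in range(n_rounds):
        shuffled = list(y_train)
        rng.shuffle(shuffled)  # breaks the descriptor-response relationship
        score, _unused_mse = evaluate(X_train, shuffled, X_test, y_test, k)
        scores.append(score)
    return scores


def make_synthetic(n=200, seed=42):
    """Synthetic stand-in data (assumption): 2 descriptors, noisy linear response."""
    rng = random.Random(seed)
    X = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(n)]
    y = [5.0 + 3.0 * x1 - 2.0 * x2 + rng.gauss(0, 0.1) for x1, x2 in X]
    return X, y


X, y = make_synthetic()
X_train, y_train = X[:150], y[:150]  # simple fixed split for illustration
X_test, y_test = X[150:], y[150:]

real_r2, real_mse = evaluate(X_train, y_train, X_test, y_test)
shuffled_r2 = y_randomization(X_train, y_train, X_test, y_test)

print(f"real model:     R2={real_r2:.2f}, MSE={real_mse:.3f}")
print(f"y-randomized:   max R2 over {len(shuffled_r2)} rounds = {max(shuffled_r2):.2f}")
```

On real data, the same comparison would be run with the selected descriptors and the chosen model in place of the synthetic inputs and the toy KNN regressor; the split, the number of shuffling rounds and `k` here are arbitrary illustrative choices.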