Atom based linear index descriptors in QSAR-machine learning classifiers for the prediction of ubiquitin-proteasome pathway activity

This report describes the use of atom-based linear indices together with several classical and machine learning classification techniques in a QSAR (quantitative structure-activity relationship) study. A PubChem BioAssay dataset of 705 compounds, 258 with inhibitory and 447 with non-inhibitory activity against the ubiquitin-proteasome pathway (UPP), was used. The classification models were developed using linear discriminant analysis, support vector machines, Bayesian networks, k-nearest neighbors, and random forests. All the QSAR models show accuracies above 85% on the training set, with Matthews correlation coefficient values ranging from 0.68 to 0.83. On the external validation set, the correct classification rates lie between 81.25% and 86.36%, with Matthews correlation coefficient values ranging from 0.59 to 0.70. The present approach is a useful tool for the early detection of novel UPP inhibitors for the treatment of multiple myeloma and related diseases.

Graphical Abstract: A dataset of 705 compounds was extracted from PubChem, with 258 active and 447 inactive compounds with respect to ubiquitin-proteasome pathway inhibitory activity. This dataset was then divided into training and validation sets of 529 and 176 compounds, respectively. The quality of the resulting QSAR models was assessed using the validation set and by checking the applicability domain.
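The two statistics quoted above, accuracy and the Matthews correlation coefficient (MCC), are both computed from the counts of a binary confusion matrix. The following is a minimal Python sketch of those standard formulas, not the authors' code. The function names and the example counts are hypothetical: the counts were chosen only to sum to 176, the size of the validation set, and are not results from the study.

```python
from math import sqrt


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) through 0 (no better than
    chance) to +1 (perfect prediction). Returns 0.0 when any marginal
    total is zero, a common convention for the undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical confusion-matrix counts for a 176-compound validation set
# (illustrative only; NOT taken from the paper).
tp, tn, fp, fn = 50, 100, 12, 14

print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.4f}")
```

Because the dataset is imbalanced (258 inhibitors against 447 non-inhibitors), MCC is a useful complement to accuracy: it uses all four cells of the confusion matrix, so a model that favors the majority class scores lower on MCC than on accuracy.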
Source: Medicinal Chemistry Research - Category: Chemistry - Source Type: research