A study to find a potent feature by combining the various disulphide bonds of protein using data mining technique

Abstract: In bioinformatics, identifying an effective protein feature is essential. The success of any classification technique relies heavily on identifying informative and distinct features. Various existing classifiers have used a single type of disulphide bond (viz. parallel or alternate) as a useful feature. However, computational efficiency may be increased by identifying an appropriate combination of disulphide bonds to serve as a single feature. Hence, in this paper, the various combinations of disulphide bonds have been studied to formulate a potent protein feature. Such a feature can be utilised in various studies to achieve better protein classification results without incorporating redundant data. A data mining approach has been applied to the seven different combinations of the three disulphide bond types (viz. parallel, alternate and quad) to identify the best feature. A statistical analysis conducted in terms of the confusion matrix and various point metrics (such as sensitivity, specificity, recall and precision) resulted in a high level of accuracy and F score for the feature formed by the combination of two disulphide bonds, i.e. the alternate and quad bonds. The average F score achieved with this combination is approximately 0.9, and the average accuracy is more than 93%. These turn out to be an unprecedented level of precision for any individual feature considered so far in any research m...
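The evaluation described in the abstract can be illustrated with a minimal Python sketch. It enumerates the seven non-empty combinations of the three bond types and computes the named point metrics from confusion-matrix counts. The function names and the example counts are hypothetical and chosen only for illustration; they are not the paper's data, code, or classifier.

```python
from itertools import combinations

# The three bond types named in the abstract.
BOND_TYPES = ("parallel", "alternate", "quad")


def feature_combinations(bond_types=BOND_TYPES):
    """Return every non-empty combination of bond types (2**3 - 1 = 7)."""
    combos = []
    for size in range(1, len(bond_types) + 1):
        combos.extend(combinations(bond_types, size))
    return combos


def point_metrics(tp, fp, tn, fn):
    """Compute standard point metrics from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # F score here is the harmonic mean of precision and recall (F1).
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_score": f_score,
    }


if __name__ == "__main__":
    for combo in feature_combinations():
        print(" + ".join(combo))
    # Hypothetical counts, NOT results from the paper.
    print(point_metrics(tp=45, fp=5, tn=45, fn=5))
```

In a study of this kind, each of the seven combinations would be evaluated with its own confusion matrix, and the resulting accuracy and F score compared across combinations.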
Source: Network Modeling Analysis in Health Informatics and Bioinformatics - Category: Bioinformatics Source Type: research