Identification of potential biomarkers on microarray data using distributed gene selection approach.

Math Biosci. 2019 Jul 18;:108230 Authors: Shukla AK, Tripathi D Abstract In recent times, several feature selection (FS) methods have been introduced to identify biomarkers from gene expression datasets. They have gained extensive attention for solving the cancer classification problem, but they have some limitations. First, the majority of FS approaches increase the computational cost because of their centralized data structure. Second, a gene that is ranked as irrelevant but could perform well in terms of classification accuracy when combined with a suitable subset of genes will be left out of the selection. To resolve these problems, we introduce a novel two-stage FS approach that combines Spearman's Correlation (SC) with distributed filter FS methods and can select highly discriminative genes for distinguishing samples in high-dimensional datasets. In distributed FS, the data are distributed by features (vertical distribution), and a merging procedure then updates the feature subset while improving classification accuracy. Moreover, the approach quantifies gene-gene and gene-class relationships and simultaneously detects subsets of essential genes. The proposed method is verified on six gene datasets with the help of four well-known classifiers, namely support vector machine, naïve Bayes, k-nearest neighbor, and decision tree...
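The abstract does not give the algorithm in detail, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It splits genes into vertical partitions, filters each partition by Spearman gene-class relevance and gene-gene redundancy, and merges a partition's genes into the running subset only when accuracy improves. The function names, the random partitioning, the redundancy threshold, the leave-one-out 1-nearest-neighbor evaluator, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    values = np.asarray(values)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]

def spearman(a, b):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    if ra.std() == 0 or rb.std() == 0:
        return 0.0  # a constant vector carries no rank information
    return float(np.corrcoef(ra, rb)[0, 1])

def filter_partition(X, y, genes, keep, max_redundancy):
    """Filter one vertical partition: gene-class relevance, then gene-gene redundancy."""
    ranked = sorted(genes, key=lambda g: -abs(spearman(X[:, g], y)))
    chosen = []
    for g in ranked:
        if len(chosen) == keep:
            break
        if all(abs(spearman(X[:, g], X[:, c])) < max_redundancy for c in chosen):
            chosen.append(g)
    return chosen

def loo_1nn_accuracy(X, y, genes):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier (assumed evaluator)."""
    Z = X[:, genes]
    dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample may not be its own neighbor
    return float((y[dists.argmin(axis=1)] == y).mean())

def distributed_selection(X, y, n_partitions=4, keep=2, max_redundancy=0.9, seed=0):
    """Merge per-partition selections, keeping a merge only if accuracy improves."""
    rng = np.random.default_rng(seed)
    partitions = np.array_split(rng.permutation(X.shape[1]), n_partitions)
    selected, best_acc = [], 0.0
    for part in partitions:
        candidate = selected + filter_partition(X, y, list(part), keep, max_redundancy)
        acc = loo_1nn_accuracy(X, y, candidate)
        if acc > best_acc:
            selected, best_acc = candidate, acc
    return selected, best_acc

# Synthetic stand-in for a microarray: 40 samples, 40 genes, two informative genes.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 40))
X[:, 5] += 3 * y
X[:, 17] -= 3 * y
genes, acc = distributed_selection(X, y)
```

Because each partition is filtered independently, the per-partition step could run in parallel on separate workers; only the merge step needs the running subset. In practice the evaluator would be replaced by one of the classifiers named in the abstract, with proper cross-validation.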
Source: Mathematical Biosciences - Category: Statistics - Tags: Math Biosci - Source Type: research