Gene expression feature selection for prostate cancer diagnosis using a two-phase heuristic–deterministic search strategy

Here, a two-phase search strategy is proposed to identify the biomarkers in gene expression data set for the prostate cancer diagnosis. A statistical filtering method is initially employed to remove the noisiest data. In the first phase of the search strategy, a multi-objective optimisation based on the binary particle swarm optimisation algorithm tuned by a chaotic method is proposed to select the optimal subset of genes with the minimum number of genes and the maximum classification accuracy. Finally, in the second phase of the search strategy, the cache-based modification of the sequential forward floating selection algorithm is used to find the most discriminant genes from the optimal subset of genes selected in the first phase. The results of applying the proposed algorithm on the available challenging prostate cancer data set demonstrate that the proposed algorithm can perfectly identify the informative genes such that the classification accuracy, sensitivity, and specificity of 100% are achieved with only nine biomarkers.
Source: IET Systems Biology - Category: Biology Source Type: research