Combining Clustering and Classification Ensembles: A Novel Pipeline to Identify Breast Cancer Profiles

Publication date: Available online 15 May 2019
Source: Artificial Intelligence in Medicine
Author(s): Utkarsh Agrawal, Daniele Soria, Christian Wagner, Jonathan Garibaldi, Ian O. Ellis, John M.S. Bartlett, David Cameron, Emad A. Rakha, Andrew R. Green

Abstract
Breast cancer is one of the most common causes of cancer death in women and is a very complex disease with varied molecular alterations. Classifying patients into biological groups is of great significance for prognosis and for treatment strategies. Recent studies have used an ensemble of multiple clustering algorithms to elucidate the most characteristic biological groups of breast cancer. However, combining various clustering methods left a number of patients unclustered. A framework is therefore still needed that can assign as many of these unclustered (i.e. biologically diverse) patients as possible to one of the identified groups in order to improve classification. In this paper we develop a novel classification framework which introduces a new ensemble classification stage after the ensemble clustering stage to target the unclustered patients. The resulting step-by-step pipeline couples ensemble clustering with ensemble classification to identify core groups, characterise the distribution of data within them, and improve the final classification results by targeting the unclustered data. The proposed pipeline is employed on a novel real-world breast...
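The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering or classification algorithms, so everything below is an assumption made for illustration. The sketch assumes that the label vectors from several clusterings are already aligned (the same integer means the same group in each), that a patient belongs to a core group only when every clustering agrees, and that the classification ensemble is a majority vote over a nearest-centroid rule, 1-NN and 3-NN. The data points and the names `consensus_labels` and `assign_unclustered` are hypothetical.

```python
from collections import Counter
from math import dist


def consensus_labels(labelings):
    """Stage 1 (assumed rule): keep a label only where all clusterings agree."""
    return [votes[0] if len(set(votes)) == 1 else None
            for votes in zip(*labelings)]


def nearest_centroid(train_X, train_y, x):
    """Predict the group whose mean point is closest to x."""
    groups = {}
    for point, label in zip(train_X, train_y):
        groups.setdefault(label, []).append(point)
    centroids = {label: tuple(sum(c) / len(pts) for c in zip(*pts))
                 for label, pts in groups.items()}
    return min(centroids, key=lambda label: dist(centroids[label], x))


def knn(train_X, train_y, x, k):
    """Predict the most common group among the k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def assign_unclustered(X, core):
    """Stage 2 (assumed rule): train on core groups, majority-vote the rest."""
    train_X = [x for x, c in zip(X, core) if c is not None]
    train_y = [c for c in core if c is not None]
    final = []
    for x, c in zip(X, core):
        if c is not None:
            final.append(c)  # core members keep their consensus group
            continue
        votes = [nearest_centroid(train_X, train_y, x),
                 knn(train_X, train_y, x, 1),
                 knn(train_X, train_y, x, 3)]
        final.append(Counter(votes).most_common(1)[0][0])
    return final


# Hypothetical toy data: two tight groups plus two ambiguous points.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
     (5.0, 5.1), (5.2, 4.9), (4.9, 5.3),
     (1.0, 1.2), (4.0, 4.1)]

# Hypothetical, already-aligned outputs of three clustering algorithms.
labelings = [
    [0, 0, 0, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 0, 0],
]

core = consensus_labels(labelings)   # last two points stay unclustered
final = assign_unclustered(X, core)  # every point now has a group
print(core)
print(final)
```

In this toy run the three clusterings disagree on the last two points, so those points are left out of the core groups and are then assigned by the classifier vote. With real data the clusterings would first need their labels matched to one another, and a stricter vote (or a confidence threshold) could be used to leave genuinely ambiguous patients unassigned.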
Source: Artificial Intelligence in Medicine - Category: Bioinformatics - Source Type: research