A comparison of survival analysis methods for cancer gene expression RNA-Sequencing data

Publication date: Available online 12 April 2019
Source: Cancer Genetics
Author(s): Pichai Raman, Samuel Zimmerman, Komal S. Rathi, Laurence de Torrenté, Mahdi Sarmady, Chao Wu, Jeremy Leipzig, Deanne M. Taylor, Aydin Tozeren, Jessica C. Mar

Abstract

Identifying genetic biomarkers of patient survival remains a major goal of large-scale cancer profiling studies. Using gene expression data to predict the outcome of a patient's tumor makes biomarker discovery a compelling tool for improving patient care. As genomic technologies expand, multiple data types may serve as informative biomarkers, and bioinformatic strategies have evolved around these different applications. For categorical variables such as a gene's mutation status, biomarker identification to predict survival time is straightforward. However, for continuous variables like gene expression, the available methods generate highly-variable results, and studies on best practices are lacking. We investigated the performance of eight methods that deal specifically with continuous data. K-means, Cox regression, concordance index, D-index, 25th-75th percentile split, median-split, distribution-based splitting, and KaplanScan were applied to four RNA-sequencing (RNA-seq) datasets from the Cancer Genome Atlas. The reliability of the eight methods was assessed by splitting each dataset into two groups and comparing the overlap of results. Gene sets that had been identified from the literature for a specific tumor type served as pos...
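
As an illustration of the simplest method named in the abstract, the median-split, the sketch below dichotomizes a gene's expression at the cohort median and compares survival between the two groups. The abstract does not say which test the authors paired with the split, so the log-rank test used here is an assumption, as are the function names `median_split` and `logrank_test` and the toy data. It is a minimal standard-library sketch, not the authors' pipeline.

```python
import math
from statistics import median


def median_split(expression):
    """Label each sample True (high) if its expression exceeds the cohort median."""
    cutoff = median(expression)
    return [value > cutoff for value in expression]


def logrank_test(times, events, in_high_group):
    """Two-group log-rank test (assumed here; not stated in the abstract).

    times: follow-up time per sample
    events: 1 if death was observed, 0 if the sample was censored
    in_high_group: boolean group label per sample
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    observed_minus_expected = 0.0
    variance = 0.0
    # Only time points with at least one observed event contribute.
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high_group[i])
        died = [i for i in at_risk if times[i] == t and events[i]]
        d = len(died)
        d_high = sum(1 for i in died if in_high_group[i])
        # Expected deaths in the high group under the null hypothesis.
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            frac = n_high / n
            variance += d * frac * (1 - frac) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (hypothetical): four patients, all deaths observed.
expression = [5.0, 6.0, 1.0, 2.0]
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]

high = median_split(expression)
chi2, p_value = logrank_test(times, events, high)
print(high, round(chi2, 3), round(p_value, 3))
```

Because the cutoff depends only on the cohort median, a sample sitting exactly at the median falls into the low group in this sketch; other tie-handling conventions are equally plausible.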
Source: Cancer Genetics - Category: Cancer & Oncology Source Type: research