The benefits and pitfalls of machine learning for biomarker discovery

AbstractProspects for the discovery of robust and reproducible biomarkers have improved considerably with the development of sensitive omics platforms that can enable measurement of biological molecules at an unprecedented scale. With technical barriers to success lowering, the challenge is now moving into the analytical domain. Genome-wide discovery presents a problem of scale and multiple testing as standard statistical methods struggle to distinguish signal from noise in increasingly complex biological systems. Machine learning and AI methods are good at finding answers in large datasets, but they have a tendency to overfit solutions. It may be possible to find a local answer or mechanism in a specific patient sample or small group of samples, but this may not generalise to wider patient populations due to the high likelihood of false discovery. The rise of explainable AI offers to improve the opportunity for true discovery by providing explanations for predictions that can be explored mechanistically before proceeding to costly and time-consuming validation studies. This review aims to introduce some of the basic concepts of machine learning and AI for biomarker discovery with a focus on post hoc explanation of predictions. To illustrate this, we consider how explainable AI has already been used successfully, and we explore a case study that applies AI to biomarker discovery in rheumatoid arthritis, demonstrating the accessibility of tools for AI and machine learning. We ...
Source: Cell and Tissue Research - Category: Cytology Source Type: research