Dark Data in Real-World Evidence: Challenges, Implications, and the Imperative of Data Literacy in Medical Research

In this study, we explore the impact of extrinsic factors on real-world evidence (RWE) outcomes, focusing on "dark data": data that are collected but then left unused or excluded from analyses. Dark data can arise at many points in the research process, from selecting study samples to data collection and analysis. Yet even unused or unanalyzed dark data hold potential insights and can provide a more comprehensive view of clinical contexts. Extrinsic factors can cause RWE outcomes to diverge from those of randomized controlled trials (RCTs) in ways that lie beyond the scope of statistical correction. Two main types of dark data exist: "known-unknown" and "unknown-unknown." The distinction between these types highlights the complexity of RWE. Transforming the unknown into the known depends on data literacy, that is, the capability to use data effectively and to interpret them on the basis of medical expertise. Shifting the focus to excluded subjects or unused data in real-world contexts reveals unexplored potential. Understanding the significance of dark data is vital to reflecting the complexity of clinical settings. Connecting RCTs and RWE requires medical data literacy, which enables clinicians to extract meaningful insights. In the era of big data and artificial intelligence, medical staff must navigate data complexities while upholding the core role of medicine. Prepared clinicians will lead this transformative journey, ensuring that the value of data shapes the medical landscape.
PMID: 38469965 | PMC: PMC10927386 | DOI: 10.3346/jkms.2024.39.e92
Source: J Korean Med Sci | Category: General Medicine | Source Type: research