Pi Day: Carlos Bustamante: Models and Data in Biomedicine: What's Real and What's Noise? And, Why Should We Care?

Data Science Distinguished Seminar Series

If you think of a scatterplot of data overlaid with a model for the data and ask practitioners from different fields, “what’s noise and what’s real?” the answers may surprise you. To a biologist, the data will almost surely be “what’s real” and the model is a poor approximation to the “truth.” To a physicist, the model is probably “what’s real” and the data is just a noisy realization of an underlying true physical process that we are attempting to study. As we think about the biomedical data enterprise in the 21st century and the massive amounts of data we generate (and want to analyze!), we need to support multiple world views and have guidance on how to translate noisy data and noisy models into actionable information.

Dr. Bustamante's presentation will draw upon several examples from Population Genetics (a field very rich in theory) and Genomics (a field not so rich in theory and much more data driven) to illustrate these points. It will also touch upon reproducible research and the question of how funding agencies need to support ecosystems for collaborative research, including data producers, consortia, and so-called “research parasites” who may want to use the data in ways that go beyond what the original experimental designers envisioned.

For more information go to https://datascience.nih.gov/PiDay2016/Schedule/Lecture

Air date: 3/14/2016 1:00:00 PM
