Zoish: A Novel Feature Selection Approach Leveraging Shapley Additive Values for Machine Learning Applications in Healthcare
Pac Symp Biocomput. 2024;29:81-95. ABSTRACT: In the intricate landscape of healthcare analytics, effective feature selection is a prerequisite for generating robust predictive models, especially given the common challenges of sample sizes and potential biases. Zoish uniquely addresses these issues by employing Shapley additive values, an idea rooted in cooperative game theory, to enable both transparent and automated feature selection. Unlike existing tools, Zoish is versatile, designed to seamlessly integrate with an array of machine learning libraries including scikit-learn, XGBoost, CatBoost, and imbalanced-learn. The distinc...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Hossein Javedani Sadaei Salvatore Loguercio Mahdi Shafiei Neyestanak Ali Torkamani Daria Prilutsky Source Type: research
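
To make the Shapley idea concrete, here is a minimal sketch of Shapley-based feature ranking on a toy problem. It is not Zoish's API: the function `shapley_values`, the scoring function `toy_score`, and the feature names are all hypothetical, and the exact enumeration used here is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small cooperative game.

    `value` maps a frozenset of players to a number (here, a model score).
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for r in range(n):
            for coalition in combinations(others, r):
                s = frozenset(coalition)
                # Probability weight of coalition s forming before p joins
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_score(subset):
    """Hypothetical gain in model score from using a feature subset."""
    score = 0.0
    if "age" in subset:
        score += 0.10
    if "bmi" in subset:
        score += 0.05
    if "age" in subset and "bmi" in subset:
        score += 0.04  # interaction, shared equally by the two features
    return score  # "noise" never changes the score


features = ["age", "bmi", "noise"]
phi = shapley_values(features, toy_score)

# Keep the two features with the largest attributed contribution
selected = sorted(features, key=lambda f: phi[f], reverse=True)[:2]
```

In this toy game the uninformative feature receives zero credit and the attributions sum to the score of the full feature set, which is the property that makes a Shapley-based ranking easy to inspect.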

SynTwin: A graph-based approach for predicting clinical outcomes using digital twins derived from synthetic patients
Pac Symp Biocomput. 2024;29:96-107. ABSTRACT: The concept of a digital twin came from the engineering, industrial, and manufacturing domains to create virtual objects or machines that could inform the design and development of real objects. This idea is appealing for precision medicine where digital twins of patients could help inform healthcare decisions. We have developed a methodology for generating and using digital twins for clinical outcome prediction. We introduce a new approach that combines synthetic data and network science to create digital twins (i.e. SynTwin) for precision medicine. First, our approach starts by ...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Jason H Moore Xi Li Jui-Hsuan Chang Nicholas P Tatonetti Dan Theodorescu Yong Chen Folkert W Asselbergs Mythreye Venkatesan Zhiping Paul Wang Source Type: research

Optimizing Computer-Aided Diagnosis with Cost-Aware Deep Learning Models
This study introduces a novel deep learning-based CAD system that incorporates a cost-sensitive parameter into the activation function. By applying our methodologies to two medical imaging datasets, our proposed study shows statistically significant increases of 3.84% and 5.4% in sensitivity while maintaining overall accuracy for Lung Image Database Consortium (LIDC) and Breast Cancer Histological Database (BreakHis), respectively. Our findings underscore the significance of integrating cost-sensitive parameters into future CAD systems to optimize performance and ultimately reduce costs and improve patient outcomes. PMID:38...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Charmi Patel Yiyang Wang Thiruvarangan Ramaraj Roselyne Tchoua Jacob Furst Daniela Raicu Source Type: research
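
The abstract does not give the form of the cost-sensitive activation, so the sketch below is only one hypothetical way such a parameter could work: a sigmoid whose input is shifted by the log of a cost ratio. The function names, the `cost` parameter, and the toy logits and labels are all assumptions made for illustration, not the authors' method or data.

```python
import math


def cost_aware_sigmoid(z, cost=1.0):
    """Hypothetical cost-aware sigmoid.

    cost > 1 means a missed positive is treated as more expensive than a
    false alarm, which pushes outputs toward the positive class.
    cost == 1 recovers the ordinary sigmoid.
    """
    return 1.0 / (1.0 + math.exp(-(z + math.log(cost))))


def sensitivity(logits, labels, cost):
    """Fraction of true positives detected at a 0.5 decision threshold."""
    detected = sum(
        1
        for z, y in zip(logits, labels)
        if y == 1 and cost_aware_sigmoid(z, cost) >= 0.5
    )
    return detected / sum(labels)


# Made-up raw model outputs and ground-truth labels (1 = disease present)
logits = [-1.2, -0.4, 0.3, 1.5, -0.8, 0.1]
labels = [1, 1, 1, 1, 0, 0]

baseline = sensitivity(logits, labels, cost=1.0)
cost_aware = sensitivity(logits, labels, cost=2.0)
```

On this toy data, raising the cost from 1 to 2 recovers one additional positive case. Whether overall accuracy is preserved, as the abstract reports for its own method, depends on the data and is not shown by this sketch.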

VetLLM: Large Language Model for Predicting Diagnosis from Veterinary Notes
Pac Symp Biocomput. 2024;29:120-133. ABSTRACT: Lack of diagnosis coding is a barrier to leveraging veterinary notes for medical and public health research. Previous work is limited to developing specialized rule-based or customized supervised learning models to predict diagnosis coding, which is tedious and not easily transferable. In this work, we show that open-source large language models (LLMs) pretrained on a general corpus can achieve reasonable performance in a zero-shot setting. Alpaca-7B can achieve a zero-shot F1 of 0.538 on CSU test data and 0.389 on PP test data, two standard benchmarks for coding from veterinary notes...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Yixing Jiang Jeremy A Irvin Andrew Y Ng James Zou Source Type: research

Impact of Measurement Noise on Genetic Association Studies of Cardiac Function
Pac Symp Biocomput. 2024;29:134-147. ABSTRACT: Recent research has effectively used quantitative traits from imaging to boost the capabilities of genome-wide association studies (GWAS), providing further understanding of disease biology and various traits. However, it's important to note that phenotyping inherently carries measurement error and noise that could influence subsequent genetic analyses. The study focused on left ventricular ejection fraction (LVEF), a vital yet potentially inaccurate quantitative measurement, to investigate how imprecision in phenotype measurement affects genetic studies. Several methods of acqui...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Milos Vukadinovic Gauri Renjith Victoria Yuan Alan Kwan Susan C Cheng Debiao Li Shoa L Clarke David Ouyang Source Type: research
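
The general effect of phenotype measurement noise on an association signal can be illustrated with a small simulation. This is a sketch under stated assumptions, not the study's analysis: the allele frequency, effect size, noise level, and sample size are all made-up values, and a simple genotype-trait correlation stands in for a full GWAS test.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n=5000, allele_freq=0.3, effect=0.5, noise_sd=2.0, seed=0):
    """Return (correlation with true trait, correlation with noisy trait)."""
    rng = random.Random(seed)
    # Genotype = number of effect alleles (0, 1, or 2) per individual
    genotypes = [
        sum(rng.random() < allele_freq for _ in range(2)) for _ in range(n)
    ]
    # True trait: additive genetic effect plus unit-variance biological variation
    true_trait = [effect * g + rng.gauss(0.0, 1.0) for g in genotypes]
    # Measured trait: true trait plus independent measurement error
    measured = [t + rng.gauss(0.0, noise_sd) for t in true_trait]
    return pearson(genotypes, true_trait), pearson(genotypes, measured)


r_true, r_measured = simulate()
```

Because the measurement error is independent of genotype, it inflates the trait variance without adding genetic signal, so `r_measured` comes out smaller than `r_true`. A weaker correlation at the same sample size means less statistical power to detect the variant.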

A deep neural network estimation of brain age is sensitive to cognitive impairment and decline
Pac Symp Biocomput. 2024;29:148-162. ABSTRACT: The greatest known risk factor for Alzheimer's disease (AD) is age. While both normal aging and AD pathology involve structural changes in the brain, their trajectories of atrophy are not the same. Recent developments in artificial intelligence have encouraged studies to leverage neuroimaging-derived measures and deep learning approaches to predict brain age, which has shown promise as a sensitive biomarker in diagnosing and monitoring AD. However, prior efforts primarily involved structural magnetic resonance imaging and conventional diffusion MRI (dMRI) metrics without accounti...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Yisu Yang Aditi Sathe Kurt Schilling Niranjana Shashikumar Elizabeth Moore Logan Dumitrescu Kimberly R Pechman Bennett A Landman Katherine A Gifford Timothy J Hohman Angela L Jefferson Derek B Archer Source Type: research

Session Introduction: Digital health technology data in biocomputing: Research efforts and considerations for expanding access (PSB2024)
Pac Symp Biocomput. 2024;29:163-169. ABSTRACT: Data from digital health technologies (DHT), including wearable sensors like Apple Watch, Whoop, Oura Ring, and Fitbit, are increasingly being used in biomedical research. Research and development of DHT-related devices, platforms, and applications is happening rapidly and with significant private-sector involvement with new biotech companies and large tech companies (e.g. Google, Apple, Amazon, Uber) investing heavily in technologies to improve human health. Many academic institutions are building capabilities related to DHT research, often in cross-sector collaboration with tec...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Michelle Holko Chris Lunt Jessilyn Dunn Source Type: research

Expanding the access of wearable silicone wristbands in community-engaged research through best practices in data analysis and integration
Pac Symp Biocomput. 2024;29:170-186. ABSTRACT: Wearable silicone wristbands are a rapidly growing exposure assessment technology that offers researchers the ability to study previously inaccessible cohorts and has the potential to provide a more comprehensive picture of chemical exposure within diverse communities. However, there are no established best practices for analyzing the data within a study or across multiple studies, thereby limiting impact and access of these data for larger meta-analyses. We utilize data from three studies, from over 600 wristbands worn by participants in New York City and Eugene, Oregon, to pres...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Lisa M Bramer Holly M Dixon David J Degnan Diana Rohlman Julie B Herbstman Kim A Anderson Katrina M Waters Source Type: research

Subject Harmonization of Digital Biomarkers: Improved Detection of Mild Cognitive Impairment from Language Markers
Pac Symp Biocomput. 2024;29:187-200. ABSTRACT: Mild cognitive impairment (MCI) represents the early stage of dementia including Alzheimer's disease (AD) and is a crucial stage for therapeutic interventions and treatment. Early detection of MCI offers opportunities for early intervention and significantly benefits cohort enrichment for clinical trials. Imaging and in vivo markers in plasma and cerebrospinal fluid biomarkers have high detection performance, yet their prohibitive costs and intrusiveness demand more affordable and accessible alternatives. The recent advances in digital biomarkers, especially language markers, hav...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Bao Hoang Yijiang Pang Hiroko H Dodge Jiayu Zhou Source Type: research

Scalar-Function Causal Discovery for Generating Causal Hypotheses with Observational Wearable Device Data
Pac Symp Biocomput. 2024;29:201-213. ABSTRACT: Digital health technologies such as wearable devices have transformed health data analytics, providing continuous, high-resolution functional data on various health metrics, thereby opening new avenues for innovative research. In this work, we introduce a new approach for generating causal hypotheses for a pair of a continuous functional variable (e.g., physical activities recorded over time) and a binary scalar variable (e.g., mobility condition indicator). Our method goes beyond traditional association-focused approaches and has the potential to reveal the underlying causal mec...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Valeriya Rogovchenko Austin Sibu Yang Ni Source Type: research

FedBrain: Federated Training of Graph Neural Networks for Connectome-based Brain Imaging Analysis
Pac Symp Biocomput. 2024;29:214-225. ABSTRACT: Recent advancements in neuroimaging techniques have sparked a growing interest in understanding the complex interactions between anatomical regions of interest (ROIs), forming into brain networks that play a crucial role in various clinical tasks, such as neural pattern discovery and disorder diagnosis. In recent years, graph neural networks (GNNs) have emerged as powerful tools for analyzing network data. However, due to the complexity of data acquisition and regulatory restrictions, brain network studies remain limited in scale and are often confined to local institutions. Thes...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Yi Yang Han Xie Hejie Cui Carl Yang Source Type: research

Session Introduction: Drug-repurposing and discovery in the era of "big" real-world data: how the incorporation of observational data, genetics, and other -omic technologies can move us forward
Pac Symp Biocomput. 2024;29:226-231. ABSTRACT: This PSB 2024 session discusses the many broad biological, computational, and statistical approaches currently being used for therapeutic drug target identification and repurposing of existing treatments. Drug repurposing efforts have the potential to dramatically improve the treatment landscape by more rapidly identifying drug targets and alternative strategies for untreated or poorly managed diseases. The overarching theme for this session is the use and integration of real-world data to identify drug-disease pairs with potential therapeutic use. These drug-disease pairs may be...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Megan M Shuey Jacklyn N Hellwege Nikhil Khankari Marijana Vujkovic Todd L Edwards Source Type: research

Systematic Estimation of Treatment Effect on Hospitalization Risk as a Drug Repurposing Screening Method
Pac Symp Biocomput. 2024;29:232-246. ABSTRACT: Drug repurposing (DR) intends to identify new uses for approved medications outside their original indication. Computational methods for finding DR candidates usually rely on prior biological and chemical information on a specific drug or target but rarely utilize real-world observations. In this work, we propose a simple and effective systematic screening approach to measure medication impact on hospitalization risk based on large-scale observational data. We use common classification systems to group drugs and diseases into broader functional categories and test for non-zero ef...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Costa Georgantas Jaume Banus Roger Hullin Jonas Richiardi Source Type: research

Transcript-aware analysis of rare predicted loss-of-function variants in the UK Biobank elucidate new isoform-trait associations
Pac Symp Biocomput. 2024;29:247-260. ABSTRACT: A single gene can produce multiple transcripts with distinct molecular functions. Rare-variant association tests often aggregate all coding variants across individual genes, without accounting for the variants' presence or consequence in resulting transcript isoforms. To evaluate the utility of transcript-aware variant sets, rare predicted loss-of-function (pLOF) variants were aggregated for 17,035 protein-coding genes using 55,558 distinct transcript-specific variant sets. These sets were tested for their association with 728 circulating proteins and 188 quantitative phenotypes ...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Rachel A Hoffing Aimee M Deaton Aaron M Holleman Lynne Krohn Philip J LoGerfo Mollie E Plekan Sebastian Akle Serrano Paul Nioi Lucas D Ward Source Type: research
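
The difference between gene-level and transcript-specific aggregation can be sketched on toy data. The variant IDs, gene and transcript names, carrier table, and the helper functions `build_sets` and `burden` below are all hypothetical; the study's actual annotation and testing pipeline is not described in the truncated abstract.

```python
from collections import defaultdict

# Hypothetical annotations: variant -> (gene, transcripts in which it is pLOF)
variants = {
    "v1": ("GENE1", {"T1", "T2"}),
    "v2": ("GENE1", {"T1"}),
    "v3": ("GENE1", {"T2"}),
}

# Hypothetical carriers: person -> set of rare variants they carry
carriers = {"p1": {"v1"}, "p2": {"v2"}, "p3": {"v3"}, "p4": set()}


def build_sets(variants):
    """Group variants into one set per gene and one set per (gene, transcript)."""
    gene_sets = defaultdict(set)
    transcript_sets = defaultdict(set)
    for variant_id, (gene, transcripts) in variants.items():
        gene_sets[gene].add(variant_id)
        for transcript in transcripts:
            transcript_sets[(gene, transcript)].add(variant_id)
    return gene_sets, transcript_sets


def burden(variant_set, carriers):
    """Burden indicator: 1 if a person carries any variant in the set, else 0."""
    return {
        person: int(bool(carried & variant_set))
        for person, carried in carriers.items()
    }


gene_sets, transcript_sets = build_sets(variants)
gene_burden = burden(gene_sets["GENE1"], carriers)
t1_burden = burden(transcript_sets[("GENE1", "T1")], carriers)
t2_burden = burden(transcript_sets[("GENE1", "T2")], carriers)
```

Here the gene-level set marks p1, p2, and p3 as carriers alike, while the transcript-specific sets separate p2 (T1 only) from p3 (T2 only). Each burden vector would then be tested against a phenotype, which is how a transcript-aware set can surface an association that gene-level aggregation dilutes.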

Generating new drug repurposing hypotheses using disease-specific hypergraphs
Pac Symp Biocomput. 2024;29:261-275. ABSTRACT: The drug development pipeline for a new compound can last 10-20 years and cost over $10 billion. Drug repurposing offers a more time- and cost-effective alternative. Computational approaches based on network graph representations, comprising a mixture of disease nodes and their interactions, have recently yielded new drug repurposing hypotheses, including suitable candidates for COVID-19. However, these interactomes remain aggregate by design and often lack disease specificity. This dilution of information may affect the relevance of drug node embeddings to a particular disease, ...
Source: Pacific Symposium on Biocomputing - December 31, 2023 Category: Bioinformatics Authors: Ayush Jain Marie-Laure Charpignon Irene Y Chen Anthony Philippakis Ahmed Alaa Source Type: research