Bio-QSARs 2.0: Unlocking a new level of predictive power for machine learning-based ecotoxicity predictions by exploiting chemical and biological information

Environ Int. 2024 Apr 4;186:108607. doi: 10.1016/j.envint.2024.108607. Online ahead of print.

ABSTRACT: Practical, legal, and ethical reasons necessitate the development of methods to replace animal experiments. Computational techniques to acquire information that traditionally relied on animal testing are considered a crucial pillar among these so-called new approach methodologies. In this light, we recently introduced the Bio-QSAR concept for multispecies aquatic toxicity regression tasks. These machine learning models, trained on both chemical and biological information, are capable of both cross-chemical and cross-species predictions. Here, we significantly extend these models' applicability. This was realized by increasing the quantity of training data by a factor of approximately 20, accomplished by considering both additional chemicals and aquatic organisms. Additionally, variable test durations and associated random effects were accommodated by employing a machine learning algorithm that combines tree-boosting with mixed-effects modeling (i.e., Gaussian Process Boosting). We also explored various biological descriptors including Dynamic Energy Budget model parameters, taxonomic distances, as well as genus-specific traits and investigated the inclusion of mode-of-action information. Through these efforts, we developed Bio-QSARs for fish and aquatic invertebrates with exceptional predictive power (R squared of up to 0.92 on independent test sets). Moreover, we made conside...
Source: Environment International - Category: Environmental Health - Source Type: research
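
As an illustration of two ideas mentioned in the abstract, the following is a minimal Python sketch, not the authors' implementation. It shows (1) how a single training row might combine chemical and biological descriptors so that one model can generalize across both chemicals and species, and (2) how R squared is computed on held-out data. The function names and descriptor keys (`logp`, `deb_param`) are hypothetical placeholders, and the handling of test duration as a random effect via Gaussian Process Boosting is deliberately not shown.

```python
def build_feature_row(chem_descriptors, bio_descriptors):
    """Merge chemical and biological descriptors into one feature row.

    Keys are prefixed so that descriptors from the two sources cannot
    collide. All descriptor names are hypothetical placeholders.
    """
    row = {f"chem_{name}": value for name, value in chem_descriptors.items()}
    row.update({f"bio_{name}": value for name, value in bio_descriptors.items()})
    return row


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

In this sketch, a perfect prediction yields an R squared of 1.0, while always predicting the mean of the observed values yields 0.0, which gives a sense of scale for the reported value of up to 0.92.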