Integrating machine learning models with cross-validation and bootstrapping for evaluating groundwater quality in Kanchanaburi Province, Thailand

This study explored the applicability of random forest (RF) and artificial neural network (ANN) models for predicting groundwater quality. The two models were combined with cross-validation (CV) and bootstrapping (B) techniques to build four predictive models: RF-CV, RF-B, ANN-CV, and ANN-B. The entropy groundwater quality index (EWQI) was converted to a normalized EWQI, which was then classified into five levels from very poor to very good. Twelve physicochemical parameters from 180 groundwater wells (potassium, sodium, calcium, magnesium, chloride, sulfate, bicarbonate, nitrate, pH, electrical conductivity, total dissolved solids, and total hardness) were investigated to assess groundwater quality in the eastern part of Kanchanaburi Province, Thailand. The results indicated that groundwater quality in the study area was primarily polluted by calcium, magnesium, and bicarbonate, and that the RF-CV model (RMSE = 0.06, R² = 0.87, MAE = 0.04) outperformed the RF-B (RMSE = 0.07, R² = 0.80, MAE = 0.04), ANN-CV (RMSE = 0.09, R² = 0.70, MAE = 0.06), and ANN-B (RMSE = 0.10, R² = 0.67, MAE = 0.06) models. These findings highlight the superiority of the RF models over the ANN models under both the CV and B techniques. In addition, the contribution of the groundwater parameters to the normalized EWQI was determined for each machine learning model. The groundwater quality map created by the RF-CV model can be used to guide groundwater use.
PMID:38636644 | DOI:10.101...
Source: Environmental Research | Category: Environmental Health | Source Type: research
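
The abstract compares models evaluated by cross-validation and by bootstrapping using RMSE, R², and MAE. The following is a minimal sketch of those two resampling schemes and three metrics, not the authors' implementation. Several things are assumptions made only for illustration: the data are synthetic (one made-up predictor and a made-up target standing in for the normalized EWQI), a one-variable least-squares line stands in for the RF and ANN models, and the fold count (10) and number of bootstrap rounds (100) are arbitrary because the abstract does not state them. All function names are hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination (assumes y_true is not constant)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_splits(n, k, rng):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


def bootstrap_splits(n, rounds, rng):
    """Yield (train, test): train is drawn with replacement, test is out-of-bag."""
    for _ in range(rounds):
        train = [rng.randrange(n) for _ in range(n)]
        drawn = set(train)
        test = [j for j in range(n) if j not in drawn]
        if len(test) > 1:  # skip degenerate rounds with too few out-of-bag samples
            yield train, test


def fit_line(xs, ys):
    """Placeholder model (assumption): one-variable least squares, not RF or ANN."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def evaluate(xs, ys, splits):
    """Return (RMSE, R2, MAE) averaged over all train/test splits."""
    scores = []
    for train, test in splits:
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        y_true = [ys[i] for i in test]
        y_pred = [model(xs[i]) for i in test]
        scores.append((rmse(y_true, y_pred), r2(y_true, y_pred), mae(y_true, y_pred)))
    return tuple(sum(s[m] for s in scores) / len(scores) for m in range(3))


def synthetic_wells(n, rng):
    """Synthetic stand-in data (assumption): NOT the study's measurements."""
    xs = [rng.random() for _ in range(n)]
    ys = [0.2 + 0.6 * x + rng.gauss(0.0, 0.03) for x in xs]
    return xs, ys


xs, ys = synthetic_wells(180, random.Random(42))
cv_scores = evaluate(xs, ys, kfold_splits(len(xs), 10, random.Random(0)))
boot_scores = evaluate(xs, ys, bootstrap_splits(len(xs), 100, random.Random(0)))
print("CV  (RMSE, R2, MAE):", cv_scores)
print("B   (RMSE, R2, MAE):", boot_scores)
```

In this sketch, cross-validation tests each well exactly once, whereas each bootstrap round trains on a resample drawn with replacement and tests on the wells left out of that resample. Replacing `fit_line` with a real RF or ANN regressor would leave the evaluation loop unchanged.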