Aquaculture water quality exhibits time-dependent dynamics, making accurate short-term forecasting essential for proactive farm management. This study presents a leakage-safe and interpretable machine-learning framework for forecasting total dissolved solids (TDS) from high-frequency aquaculture sensor time series. The proposed approach integrates lag-based feature engineering with an expanding-window walk-forward validation protocol (19 folds) to ensure realistic time-forward evaluation and to avoid information leakage. Under a leakage-safe lag-only specification that excludes redundant conductivity predictors, tree-based ensemble learning emerged as the most robust solution. XGBoost achieved the highest forecasting accuracy, yielding a mean MAE of 0.314 ± 0.482 mg/L and RMSE of 1.596 ± 4.206 mg/L across walk-forward folds. Residual diagnostics based on ACF/PACF and Ljung–Box testing indicated no significant remaining autocorrelation, confirming that predictive skill is not driven by residual serial dependence. SHAP-based interpretation revealed that TDS dynamics are primarily governed by ionic-strength-related signals, whereas temperature and pH contribute marginally. By combining leakage-safe validation, ensemble forecasting, and explainable inference, this work advances an operational early-warning and decision-support framework for sustainable aquaculture water-quality management.
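The expanding-window walk-forward protocol with lag-only features described in the abstract can be illustrated with a minimal sketch. Only the fold count (19) comes from the abstract; the number of lags, the minimum training size, the synthetic series, and the drift baseline standing in for XGBoost are all assumptions made for illustration, not the authors' configuration.

```python
import math
from statistics import mean, stdev


def make_lag_rows(series, n_lags):
    """Pair each value with the n_lags values that precede it (past-only features)."""
    return [(series[t - n_lags:t], series[t]) for t in range(n_lags, len(series))]


def expanding_window_folds(n_rows, n_folds, min_train):
    """Yield (train, test) index ranges; each training window starts at 0 and grows."""
    test_size = (n_rows - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough rows for the requested number of folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        yield range(0, train_end), range(train_end, train_end + test_size)


def fit_drift_baseline(train_rows):
    """Hypothetical stand-in model: last lag plus the mean one-step change seen in training."""
    drift = mean(target - lags[-1] for lags, target in train_rows)
    return lambda lags: lags[-1] + drift


def walk_forward_mae(series, n_lags=3, n_folds=19, min_train=20):
    """Refit on each expanding window and score MAE on the block that follows it."""
    rows = make_lag_rows(series, n_lags)
    fold_mae = []
    for train_idx, test_idx in expanding_window_folds(len(rows), n_folds, min_train):
        predict = fit_drift_baseline([rows[i] for i in train_idx])
        fold_mae.append(mean(abs(predict(rows[i][0]) - rows[i][1]) for i in test_idx))
    return fold_mae


# Synthetic TDS-like series (illustrative values only, not the study's data).
tds = [500.0 + 5.0 * math.sin(i / 6.0) + 0.1 * i for i in range(120)]
fold_mae = walk_forward_mae(tds)
print(f"MAE across {len(fold_mae)} folds: {mean(fold_mae):.3f} ± {stdev(fold_mae):.3f}")
```

Because every test block lies strictly after its training window and every feature is a past value, no future information reaches the model. Reporting the mean and standard deviation of the per-fold errors mirrors the "mean ± spread across folds" form in which the abstract reports MAE and RMSE.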
Operational early-warning forecasts of aquaculture water quality: Interpretable ML for TDS under walk-forward validation / M.A.A.M. Hridoy, M.B.. - In: ECOLOGICAL INFORMATICS. - ISSN 1574-9541. - 96:(2026 Jun 02), pp. 103804.1-103804.15. [10.1016/j.ecoinf.2026.103804]
Operational early-warning forecasts of aquaculture water quality: Interpretable ML for TDS under walk-forward validation
M. Bodini (second author)
2026
| File | Access | Description | Type | License | Size | Format |
|---|---|---|---|---|---|---|
| main.pdf | Open access | Version available online | Publisher's version/PDF | Creative Commons | 8.16 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.




