Enhancing the interpretability of spatially variable N2O model predictions with soft sensors during wastewater treatment

arXiv cs.LG / 5/7/2026

Key Points

  • The study evaluates machine-learning models that predict spatially variable nitrous oxide (N2O) emissions and operational disturbances in wastewater treatment plants (WWTPs) using operational nutrient-control datasets plus dedicated N2O measurements.
  • On a real dataset, four ML models predicted N2O emissions with R² = 0.79–0.89, which the authors use to validate the modelling framework; on simulated monitoring campaigns the models were highly accurate (0.97 ± 0.02, n = 80).
  • Although predictive accuracy is high, feature importance and interpretability vary depending on the model, the simulation scenario, and the N2O measurement scale (reactor-level vs. WWTP-level).
  • The authors argue that “soft sensor” model predictions are constrained by the measurement location and dataset uncertainty, which can limit how confidently results can be interpreted.
  • Analysis of the plant-wide mechanistic model's structure exposes interactions between autotrophic and heterotrophic pathways over nitric oxide that can lead to overestimated aerobic nitrite production and biased estimates of each pathway's contribution to N2O.

Abstract

Model-based solutions for nitrous oxide (N2O) emissions from wastewater treatment plants (WWTPs) are informed by operational datasets designed to control nutrient levels in liquid waste, coupled with dedicated campaigns for N2O measurements. We analysed how machine learning (ML) models predict disturbances to wastewater treatment operation and spatially variable N2O emissions. A real dataset was investigated to validate the modelling framework from N2O emissions predicted by four ML models (R² = 0.79–0.89). Monitoring campaigns for N2O were simulated with a plant-wide mechanistic model to include additional sensors, site-level N2O datasets, and wastewater disturbances (n = 16). ML models were highly accurate (0.97 ± 0.02, n = 80), but the feature importance depended on the model, the scenario and the N2O measurement scale (reactor vs. WWTP). We argue that N2O soft sensor model predictions are limited to the measuring location and the methodological uncertainty of the dataset, which affect the interpretability of the model. Lastly, the analysis of the mechanistic model structure exposed interactions between autotrophic and heterotrophic pathways over nitric oxide which can overestimate aerobic nitrite production and bias the N2O pathway contributions.
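
The claim that feature importance can depend on the model even when accuracy is high can be illustrated with a small sketch. It is not the authors' pipeline: the data are synthetic, the two "models" are fixed hand-written predictors rather than trained ML models, and the names sensor_a, sensor_b, model_a and model_b are hypothetical. Two strongly correlated signals both track the target, each model relies on a different one, and permutation importance (the mean drop in R² when one input column is shuffled) is computed for each.

```python
import random
import statistics


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(model, X, y, column, rng, repeats=20):
    """Mean drop in R² when the values of one feature column are shuffled."""
    baseline = r2_score(y, [model(row) for row in X])
    drops = []
    for _ in range(repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [
            row[:column] + (value,) + row[column + 1:]
            for row, value in zip(X, shuffled)
        ]
        drops.append(baseline - r2_score(y, [model(row) for row in X_perm]))
    return statistics.fmean(drops)


rng = random.Random(0)

# Hypothetical synthetic data: sensor_b is a noisy copy of sensor_a,
# and the target depends on sensor_a plus measurement noise.
X, y = [], []
for _ in range(500):
    sensor_a = rng.uniform(0.5, 3.0)
    sensor_b = sensor_a + rng.gauss(0.0, 0.05)
    X.append((sensor_a, sensor_b))
    y.append(2.0 * sensor_a + rng.gauss(0.0, 0.1))


def model_a(row):
    """Hypothetical predictor that uses only sensor_a (column 0)."""
    return 2.0 * row[0]


def model_b(row):
    """Hypothetical predictor that uses only sensor_b (column 1)."""
    return 2.0 * row[1]


results = {}
for name, model in (("model_a", model_a), ("model_b", model_b)):
    results[name] = {
        "r2": r2_score(y, [model(row) for row in X]),
        "importance": [
            permutation_importance(model, X, y, column, rng)
            for column in (0, 1)
        ],
    }

for name, summary in results.items():
    print(name, round(summary["r2"], 3), [round(v, 3) for v in summary["importance"]])
```

Both predictors fit the target closely, yet each assigns all of its importance to a different input. Under these assumptions, a ranking of features describes what a given model used, not necessarily which process variable drives the target, which is one way to read the interpretability caveat in the abstract.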