Peoples Water Data: Enabling Reliable Field Data Generation and Microbial Contamination Screening in Household Drinking Water
arXiv cs.LG / 4/7/2026
Key Points
- The paper introduces a two-stage machine-learning framework that predicts E. coli presence in household point-of-use drinking water in Chennai, India, from low-cost physicochemical and contextual indicators, since laboratory testing is not accessible at scale.
- It analyzes data from the Peoples Water Data initiative: 3,023 field samples were harmonized and quality-controlled, leaving 2,207 valid samples (about 73%) for modeling.
- The resulting decision-support approach is designed to help prioritize which households should receive microbiological testing in resource-constrained settings, addressing a gap in routine point-of-use contamination assessment.
- The study is also embedded in an AI-supported field framework that provides student-facing guidance and real-time quality control to improve the adherence, traceability, and reliability of collected water data.
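
The screening-then-prioritization idea in the key points can be illustrated with a minimal Python sketch. This is not the paper's method: the summary does not say what the two stages are, which features are used, or how quality control is defined. The `Sample` fields, the `passes_qc` rule, and the `risk_score` values below are all hypothetical stand-ins, with `risk_score` representing the output of some already-trained classifier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    household_id: str
    ph: Optional[float]             # hypothetical physicochemical reading
    turbidity_ntu: Optional[float]  # hypothetical physicochemical reading
    risk_score: float               # hypothetical model output in [0, 1]


def passes_qc(sample: Sample) -> bool:
    """Hypothetical QC rule: both readings present and physically plausible."""
    if sample.ph is None or sample.turbidity_ntu is None:
        return False
    return 0.0 <= sample.ph <= 14.0 and sample.turbidity_ntu >= 0.0


def prioritize(samples: List[Sample], budget: int) -> List[str]:
    """Return the household IDs to test first, given a testing budget."""
    # Drop records that fail quality control before ranking.
    valid = [s for s in samples if passes_qc(s)]
    # Highest predicted contamination risk first.
    ranked = sorted(valid, key=lambda s: s.risk_score, reverse=True)
    return [s.household_id for s in ranked[:budget]]


samples = [
    Sample("H1", 7.2, 1.5, 0.30),
    Sample("H2", 6.8, 12.0, 0.85),
    Sample("H3", None, 3.0, 0.95),  # missing pH, so it fails QC
    Sample("H4", 7.9, 4.1, 0.60),
]
print(prioritize(samples, budget=2))
```

In this sketch, quality control runs before ranking, so a record with a missing reading never consumes part of the limited testing budget, however high its predicted risk.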



