I built a Streamlit-based AI data analysis tool that:
- Fills missing values using ML models (not just mean/median)
- Predicts any missing column using n-1 inputs
- Detects anomalies
- Shows correlations and feature importance
- Lets you download the updated dataset

(Attached images show the UI and a before vs. after CSV file, with a sample CSV available on the GitHub page, as well as an image showing the achieved performance metrics.)

I wanted to test how well it works on real-world incomplete datasets. Would love feedback on:
- model approach
- accuracy issues
- any improvements I should make

GitHub: https://github.com/WALKER00058/ML-data-analysis/tree/main
Built an AI tool that cleans datasets, fills missing values, and predicts unknown fields [P]
Reddit r/MachineLearning / 4/14/2026
💬 Opinion · Signals & Early Trends · Tools & Practical Usage
Key Points
- The post describes a Streamlit-based AI tool for real-world dataset cleanup that fills missing values using machine learning models rather than simple imputation methods.
- It can predict the missing values of any one column, using the remaining n-1 columns as inputs.
- The tool also includes anomaly detection plus correlation and feature-importance reporting to help users understand data quality and drivers.
- Users can review UI screenshots and compare before/after CSV outputs, and download the cleaned dataset produced by the tool.
- The author shares the project on GitHub and asks for feedback on the modeling approach and accuracy, inviting suggestions for improvements.
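
The post does not say which models the tool uses, so the following is only a rough sketch of the "predict one column from the other n-1 columns" idea, with a k-nearest-neighbours average standing in for whatever model the project actually trains. The function name `knn_impute`, the column names, and the sample rows are all made up for illustration and are not taken from the linked repository.

```python
import math


def knn_impute(rows, target, k=3):
    """Fill None values in column `target` from the k nearest complete rows.

    rows   : list of dicts mapping column name -> float (None if missing)
    target : name of the column to fill
    Distance is Euclidean over the other n-1 columns, each scaled by its range.
    Hypothetical helper: a stand-in for the tool's (unspecified) model.
    """
    features = [c for c in rows[0] if c != target]

    # Rows usable as "training data": target and every feature present.
    known = [
        r for r in rows
        if r[target] is not None and all(r[f] is not None for f in features)
    ]
    if not known:
        raise ValueError("no complete rows to learn from")

    # Range of each feature, so that no single column dominates the distance.
    spans = {}
    for f in features:
        vals = [r[f] for r in known]
        spans[f] = (max(vals) - min(vals)) or 1.0

    def distance(a, b):
        return math.sqrt(sum(((a[f] - b[f]) / spans[f]) ** 2 for f in features))

    filled = []
    for r in rows:
        r = dict(r)  # copy, so the caller's data is left untouched
        if r[target] is None and all(r[f] is not None for f in features):
            nearest = sorted(known, key=lambda kr: distance(r, kr))[:k]
            r[target] = sum(n[target] for n in nearest) / len(nearest)
        filled.append(r)
    return filled


# Made-up sample data: the last row is missing its "price".
data = [
    {"sqft": 800.0, "rooms": 2.0, "price": 160.0},
    {"sqft": 1000.0, "rooms": 3.0, "price": 200.0},
    {"sqft": 1200.0, "rooms": 3.0, "price": 240.0},
    {"sqft": 2000.0, "rooms": 5.0, "price": 400.0},
    {"sqft": 1100.0, "rooms": 3.0, "price": None},
]

result = knn_impute(data, target="price", k=2)
print(result[-1]["price"])  # 220.0 (mean of the two closest rows: 200 and 240)
```

Because `target` is a parameter, the same function can fill any column from the remaining ones, which is the n-1 idea in miniature. This sketch skips rows that are also missing a feature value, and it does nothing to check accuracy. A fair way to measure that would be to hide values that are actually known, impute them, and compare against the truth.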


