A few weeks ago I was working on a training run that produced garbage results.
No errors, no crashes, just a model that learned nothing. Three days later I found it: label leakage between the train and validation splits. The model had been cheating the whole time.
So I built preflight. It's a CLI tool you run before training starts that catches the silent stuff: NaNs, label leakage, wrong channel ordering, dead gradients, class imbalance, plus a VRAM estimate. Ten checks total, across fatal/warn/info severity tiers. It exits with code 1 on any fatal failure, so it can block CI.
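To give a feel for what a check like this does under the hood, here's a minimal sketch of train/val overlap detection by hashing raw sample bytes. This is my own illustration, not preflight's actual implementation; the function names are hypothetical.

```python
import hashlib
import numpy as np

def sample_hashes(arr):
    """Hash each row's raw bytes so identical samples collide."""
    return {hashlib.sha1(np.ascontiguousarray(row).tobytes()).hexdigest()
            for row in arr}

def count_overlap(train_x, val_x):
    """Count validation samples that also appear verbatim in train."""
    return len(sample_hashes(train_x) & sample_hashes(val_x))
```

Exact byte-level hashing only catches verbatim duplicates; near-duplicate leakage (resized images, re-encoded audio) needs fuzzier matching.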
```
pip install preflight-ml
preflight run --dataloader my_dataloader.py
```
It's very early (v0.1.1, just pushed it), so I'd genuinely love feedback on which checks matter most to people, what I've missed, and what's wrong with the current approach. If anyone wants to contribute a check or two, that'd be even better: each one just needs a passing test, a failing test, and a fix hint.
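For contributors, a check boils down to something like the shape below. This is a hypothetical sketch of the pattern (the real plugin interface in the repo may differ); the function and its return convention are my own invention for illustration.

```python
import numpy as np

def check_no_nans(batch):
    """Fatal check: fail if any value in the batch is NaN or inf.

    Returns (ok, fix_hint) -- the hint is shown only on failure.
    """
    if np.isfinite(batch).all():
        return True, ""
    return False, "Inspect your preprocessing; a divide-by-zero or bad normalization often introduces NaNs."

# Passing test: clean data.
clean = np.ones((4, 3))

# Failing test: one NaN slipped in.
dirty = np.ones((4, 3))
dirty[0, 0] = np.nan
```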
GitHub: https://github.com/Rusheel86/preflight
PyPI: https://pypi.org/project/preflight-ml/
Not trying to replace pytest or Deepchecks, just fill the gap between "my code runs" and "my training will actually work."




