
[P] preflight, a pre-training validator for PyTorch I built after losing 3 days to label leakage

Reddit r/MachineLearning / 3/15/2026


Key Points

  • The author built preflight, a CLI tool that runs before training to catch common silent issues that sabotage model performance.
  • It includes ten checks across fatal, warn, and info severity tiers, and exits with code 1 on fatal failures so it can block CI.
  • The tool targets issues like label leakage, NaNs, wrong channel ordering, dead gradients, class imbalance, and VRAM estimation.
  • The project is in early release (v0.1.1) with GitHub and PyPI pages, and the author is seeking feedback and contributions.

A few weeks ago I was working on a training run that produced garbage results.

No errors, no crashes, just a model that learned nothing. Three days later I found it. Label leakage between train and val. The model had been cheating the whole time.
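For anyone who hasn't hit this failure mode: the core idea of a leakage check is to look for exact samples that appear in both splits. This is not preflight's actual implementation, just a minimal sketch of the idea, with hypothetical names (`sample_hash`, `leaked_fraction`):

```python
import hashlib

def sample_hash(x):
    """Hash a sample's repr so identical examples collide."""
    return hashlib.sha256(repr(x).encode()).hexdigest()

def leaked_fraction(train, val):
    """Fraction of val samples whose exact content also appears in train."""
    train_hashes = {sample_hash(x) for x in train}
    overlap = sum(1 for x in val if sample_hash(x) in train_hashes)
    return overlap / max(len(val), 1)

# Example: one of three val samples also appears in train.
train = [(1, 2), (3, 4), (5, 6)]
val = [(3, 4), (7, 8), (9, 10)]
print(leaked_fraction(train, val))  # one third of val leaks from train
```

Exact-match hashing misses near-duplicates (augmented copies, re-encoded images), but it catches the "same row in both splits" case that silently inflates val accuracy.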

So I built preflight. It's a CLI tool you run before training starts that catches the silent stuff: NaNs, label leakage, wrong channel ordering, dead gradients, class imbalance, VRAM estimation. Ten checks total across fatal/warn/info severity tiers. It exits with code 1 on fatal failures so it can block CI.
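The severity-tier-to-exit-code mapping is the part that makes it CI-friendly. Here is a hypothetical sketch of how that wiring could look (assumed structure, not preflight's actual API): only fatal failures flip the exit code, while warn/info failures get printed but don't block the job.

```python
FATAL, WARN, INFO = "fatal", "warn", "info"

def run_checks(checks):
    """Run (name, severity, passed) tuples; return exit code 1 if any fatal check failed."""
    exit_code = 0
    for name, severity, passed in checks:
        print(f"[{'PASS' if passed else severity.upper()}] {name}")
        if not passed and severity == FATAL:
            exit_code = 1  # a fatal failure blocks the CI job
    return exit_code

checks = [
    ("train/val labels disjoint", FATAL, False),
    ("class balance within tolerance", WARN, True),
    ("estimated VRAM fits device", INFO, True),
]
print(run_checks(checks))  # nonzero because the fatal check failed
```

In a pipeline you'd pass that return value to `sys.exit()`, so `preflight run ... && python train.py` only starts training when nothing fatal tripped.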

pip install preflight-ml

preflight run --dataloader my_dataloader.py

It's very early (v0.1.1, just pushed). I'd genuinely love feedback on which checks matter most to people, what I've missed, and what's wrong with the current approach. If anyone wants to contribute a check or two, that'd be even better: each one just needs a passing test, a failing test, and a fix hint.
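To make "passing test, failing test, and a fix hint" concrete, here's a hypothetical shape a contributed check could take. Everything here (`check_no_constant_labels`, `FIX_HINT`) is illustrative, not the repo's real interface:

```python
def check_no_constant_labels(labels):
    """Fail if every label is identical -- the model would have nothing to learn."""
    return len(set(labels)) > 1

FIX_HINT = "All labels are identical; check the label column and your split logic."

assert check_no_constant_labels([0, 1, 0, 1])      # passing case
assert not check_no_constant_labels([1, 1, 1, 1])  # failing case
```

The pass/fail pair pins down what the check means, and the fix hint is what a user sees instead of a bare assertion error.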

GitHub: https://github.com/Rusheel86/preflight

PyPI: https://pypi.org/project/preflight-ml/

Not trying to replace pytest or Deepchecks, just fill the gap between "my code runs" and "my training will actually work."

submitted by /u/Red_Egnival