HPO - hyperparameter drift [D]

Reddit r/MachineLearning / 4/25/2026


Key Points

  • The author is facing a practical challenge running hyperparameter optimization (HPO) for very large ML models that take about a full day to train to completion.
  • They shorten HPO trials by reducing epochs (from full-training settings down to ~1–2 hours per trial, and even under 30 minutes with pruning) and worry this causes “hyperparameter drift” because schedules and other HPO-related parameters may not work well for full training.
  • They are managing frequent retraining needs (about twice a month) across five different models, plus periodic architecture changes that require fresh HPO runs, making full training for every HPO trial infeasible.
  • They ask how teams typically avoid parameter drift between short HPO runs and full training runs, and whether pruning methods (specifically a median pruner) bias results toward models that converge quickly rather than those that reach better final performance.
  • They propose restarting the learning-rate scheduler after it appears to stop learning and seek advice on whether that would address the learning-rate scheduling issue.

Hey all, so I am running into a problem. I am training massive ML models which take literally a day to fully train.

We run HPO to find the best parameters for each model, and the task demands very high accuracy, so the HPO step is non-negotiable.

Because the model takes a day to fully train, we reduced the number of epochs for the HPO stage so that each HPO trial takes around 1 to 2 hours.

With pruning we can get to under 30 minutes per trial. The thing is, we want these models retrained with HPO about twice a month, so I can't be doing full training runs for every HPO trial, and we have 5 different models that we need to train and keep up to date.
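For context, the pruning rule the post relies on (this is essentially what Optuna's `MedianPruner` implements) can be sketched in a few lines of plain Python. The function name and the toy loss numbers below are illustrative, not from the post:

```python
import statistics

def should_prune(step, value, history):
    """Median-pruning rule: stop a trial whose intermediate validation
    loss at `step` is worse than the median of earlier trials' losses
    at the same step. `history` maps step -> list of earlier losses."""
    past = history.get(step, [])
    if not past:
        return False  # nothing to compare against yet
    return value > statistics.median(past)

# Toy history: validation losses of three earlier trials at epochs 0-2
history = {0: [1.0, 0.9, 1.1], 1: [0.7, 0.6, 0.8], 2: [0.5, 0.4, 0.6]}

print(should_prune(0, 1.05, history))  # True: worse than the median (1.0), pruned
print(should_prune(0, 0.95, history))  # False: better than the median, kept
```

Note that the comparison happens at every reported step, which is exactly why a short-epoch HPO budget interacts with pruning the way the post describes.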

We also change the model architecture periodically, so those need fresh HPO runs too.

The main issue I am running into is that by reducing the HPO epochs below what the full training runs use, I fear my learning-rate schedule and other tuned parameters end up poorly suited to a full training run.

How do you manage HPO for these massive training runs and avoid parameter drift between the short HPO trials and the full training run?

Also, last question: does pruning reward models that converge fast and punish models that might converge to a better final result, just more slowly? We prune with a median pruner, and I'm finding most surviving models converge fast but don't learn anything past a certain point.
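On the convergence-speed bias: a plain median rule does cut slow starters at every reported step. The usual mitigations are to delay pruning for a warmup period and to prune against a harsher cutoff than the median (Optuna's `MedianPruner` exposes `n_warmup_steps` for the first idea, and `PercentilePruner` covers the second). A hypothetical sketch combining both, with made-up numbers:

```python
import statistics

def should_prune(step, value, past_values, n_warmup_steps=3):
    """Like median pruning, but (a) never prune during the first
    `n_warmup_steps` epochs, so slow starters get a chance, and
    (b) only cut trials in the worst quartile rather than everything
    below the median. Lower `value` (validation loss) is better."""
    if step < n_warmup_steps or len(past_values) < 2:
        return False
    q1, q2, q3 = statistics.quantiles(past_values, n=4)
    return value > q3  # worse than ~75% of earlier trials -> prune

past = [0.4, 0.5, 0.6, 0.7]  # earlier trials' losses at this step
print(should_prune(1, 0.9, past))  # False: still inside the warmup window
print(should_prune(3, 0.9, past))  # True: past warmup and in the worst quartile
```

Both knobs trade some compute (slow trials live longer) for less bias toward fast convergers.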

I'm considering restarting my LR scheduler from the start once the model stops learning, to see if that fixes the LR problem. Similar to early stopping, but instead of stopping, the LR kicks back up again. What do you think??
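Restarting the learning rate is an established technique: cosine annealing with warm restarts (SGDR), which PyTorch ships as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`. A dependency-free sketch of that schedule follows; all constants are illustrative, and the post's variant would trigger the restart on a loss plateau rather than on a fixed cycle length:

```python
import math

def lr_with_restarts(step, base_lr=1e-3, min_lr=1e-5, cycle_len=10):
    """Cosine-annealed learning rate that jumps back up to base_lr
    every `cycle_len` steps (a fixed-schedule version of 'start the
    LR back up again once learning stalls')."""
    t = step % cycle_len  # position within the current cycle
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t / cycle_len))

print(lr_with_restarts(0))   # cycle start: base_lr
print(lr_with_restarts(9))   # near the end of a cycle: close to min_lr
print(lr_with_restarts(10))  # restart: back up to base_lr
```

Triggering the restart on a plateau instead of a fixed cycle is closer to what the post proposes, and amounts to watching the validation loss and resetting `step` to a cycle boundary when it stalls.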

submitted by /u/Counter-Business