AI Navigate

[P] Using residual ML correction on top of a deterministic physics simulator for F1 strategy prediction

Reddit r/MachineLearning / 3/16/2026

💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage · Models & Research

Key Points

  • The post presents F1Predict, a race simulation system that layers a residual LightGBM model, trained on FastF1 telemetry, over a deterministic physics baseline to correct pace deltas before Monte Carlo simulation.
  • A 10,000-iteration Monte Carlo generates P10/P50/P90 distributions per driver per race, while an auxiliary safety car hazard classifier modulates pit strategy probability per lap window.
  • The feature pipeline is versioned (tyre age × compound, qualifying delta, sector variance, DRS activation rate, track evolution, weather delta), and a separate 400-iteration strategy optimizer keeps web response times reasonable.
  • The ML path degrades gracefully: when no trained artifact exists, the system falls back to the deterministic baseline, and Redis caches results keyed by the sha256 of the request.
  • It is a learning project with a public repo and live demo; the author invites discussion on modelling choices and architecture.

Personal project I've been working on as a CSE student: F1Predict, a race simulation and strategy intelligence system.

Architecture overview:

- Deterministic lap time engine (tyre deg, fuel load, DRS, traffic) as the baseline

- LightGBM residual model trained on FastF1 historical telemetry to correct pace deltas — injected into driver profile generation before Monte Carlo execution

- 10,000-iteration Monte Carlo producing P10/P50/P90 distributions per driver per race

- Auxiliary safety car hazard classifier (per lap window) modulating SC probability in simulation

- Feature versioning in the pipeline: tyre age × compound, qualifying delta, sector variance, DRS activation rate, track evolution coefficient, weather delta

- Strategy optimizer runs at 400 iterations (separate from the main MC engine) to keep web response times reasonable
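The residual-plus-Monte-Carlo flow in the bullets above can be sketched roughly like this. All names, the Gaussian noise model, and the numbers are illustrative assumptions, not taken from the repo; the point is only the shape of the pipeline: deterministic baseline, optional ML residual correction, then percentile extraction over many simulated races.

```python
import numpy as np

def simulate_race_paces(baseline_laps, residual_model=None, features=None,
                        n_iter=10_000, noise_sd=0.25, seed=0):
    """Monte Carlo over per-lap pace: a deterministic baseline, an optional
    ML residual correction, and Gaussian lap-to-lap noise (all hypothetical)."""
    rng = np.random.default_rng(seed)
    laps = np.asarray(baseline_laps, dtype=float)
    if residual_model is not None and features is not None:
        # The residual model predicts a per-lap pace delta, added to the baseline
        # before simulation -- mirroring "injected into driver profile generation".
        laps = laps + residual_model.predict(features)
    # n_iter simulated races, each perturbing every lap independently.
    sims = laps + rng.normal(0.0, noise_sd, size=(n_iter, laps.size))
    totals = sims.sum(axis=1)
    p10, p50, p90 = np.percentile(totals, [10, 50, 90])
    return p10, p50, p90

# With no trained artifact, residual_model=None leaves the deterministic baseline.
baseline = [90.5] * 50  # hypothetical 50-lap race at a flat 90.5 s/lap
p10, p50, p90 = simulate_race_paces(baseline, n_iter=10_000)
```

In a setup like the one described, `residual_model` would be the trained LightGBM artifact and `features` the versioned feature matrix; passing `None` for both is exactly the graceful-degradation path the post mentions.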

The ML layer degrades gracefully: if no trained artifact is present, simulation falls back cleanly to the deterministic baseline. Redis caches results keyed on the sha256 of the normalized request.
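The cache-key idea can be illustrated with a minimal sketch. The normalization step here (sorted keys, compact JSON) is an assumption about what "normalized request" means; the function name is hypothetical, and the actual Redis get/set calls are omitted.

```python
import hashlib
import json

def cache_key(request: dict) -> str:
    """Derive a stable cache key from a request payload: serialize with sorted
    keys and compact separators so semantically identical requests hash to the
    same sha256 digest, regardless of key order."""
    normalized = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Key is order-insensitive thanks to normalization:
a = cache_key({"race": "monaco", "driver": "VER", "laps": 78})
b = cache_key({"laps": 78, "driver": "VER", "race": "monaco"})
assert a == b
```

The digest would then serve as the Redis key (e.g. `GET`/`SETEX` on it), so repeated identical simulation requests hit the cache instead of re-running the Monte Carlo.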

Current limitation: v1 residual artifact is still being trained on a broader historical dataset, so ML and deterministic paths are close in output for now. Scaffolding and governance are in place.

Stack: Python · FastAPI · LightGBM · FastF1 · Supabase · Redis · React/TypeScript

Repo: https://github.com/XVX-016/F1-PREDICT

Live: https://f1.tanmmay.me

Happy to discuss the modelling approach, feature engineering choices, or anything that looks architecturally off. This is a learning project and I'd genuinely value technical feedback.

submitted by /u/CharacterAd4557