Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production
arXiv cs.AI / 4/15/2026
Key Points
- The paper tackles human-robot task planning and allocation (HRTPA) in production by accounting for worker physical fatigue, keeping task assignments within safe limits while optimizing efficiency under dynamic production conditions.
- It argues that commonly used HRTPA fatigue-recovery models depend on static hyperparameters; instead, it treats fatigue-related parameters as uncertain quantities estimated online from observed fatigue progression.
- The proposed method, PF-CD3Q, combines particle-filter-based online fatigue estimation with safe reinforcement learning using constrained dueling double deep Q-learning for real-time decision-making.
- During planning, the system predicts fatigue per task and filters out actions expected to exceed fatigue thresholds, turning HRTPA into a constrained Markov decision process to ensure safety.
- The work positions itself within Industry 5.0/ergonomics, aiming to make collaborative manufacturing more resilient to day-to-day variability in human fatigue sensitivity.
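The combination of online parameter estimation and action filtering described above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: it assumes a simple exponential fatigue-accumulation model, uses a particle filter to estimate the worker's unknown fatigue rate `lam` from observed fatigue levels, and then filters out candidate tasks whose predicted end-of-task fatigue would exceed a safety threshold. All function and class names, the dynamics model, and the noise parameters are hypothetical choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed exponential fatigue dynamics (a common simplification):
# working for `duration` maps fatigue f -> 1 - (1 - f) * exp(-lam * duration)
def predict_fatigue(f, lam, duration):
    return 1.0 - (1.0 - f) * np.exp(-lam * duration)

class FatigueParticleFilter:
    """Tracks a posterior over the unknown fatigue rate `lam` via particles."""

    def __init__(self, n=500, lam_range=(0.01, 0.5), obs_noise=0.05):
        self.lam = rng.uniform(*lam_range, size=n)  # particle hypotheses
        self.w = np.full(n, 1.0 / n)                # particle weights
        self.obs_noise = obs_noise                  # assumed sensor noise

    def update(self, f_prev, f_obs, duration):
        # Weight each particle by how well it predicts the observed fatigue.
        pred = predict_fatigue(f_prev, self.lam, duration)
        lik = np.exp(-0.5 * ((f_obs - pred) / self.obs_noise) ** 2)
        self.w *= lik
        self.w += 1e-300  # guard against all-zero weights
        self.w /= self.w.sum()
        # Resample (with jitter) when the effective sample size collapses.
        if 1.0 / np.sum(self.w ** 2) < 0.5 * len(self.lam):
            idx = rng.choice(len(self.lam), size=len(self.lam), p=self.w)
            self.lam = self.lam[idx] + rng.normal(0.0, 0.005, len(self.lam))
            self.w[:] = 1.0 / len(self.lam)

    def estimate(self):
        return float(np.sum(self.w * self.lam))

def safe_actions(f_now, tasks, pf, threshold=0.8):
    """Keep only tasks whose predicted end fatigue stays under the threshold,
    mimicking the constrained-MDP action filtering step."""
    lam_hat = pf.estimate()
    return [name for name, dur in tasks
            if predict_fatigue(f_now, lam_hat, dur) <= threshold]
```

In a full PF-CD3Q-style system, the surviving actions would then be scored by a constrained dueling double DQN; here the filter alone shows how an online fatigue estimate narrows the safe action set per worker and per day.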
Related Articles
Are gamers being used as free labeling labor? The rise of "Simulators" that look like AI training grounds [D]
Reddit r/MachineLearning

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
Dev.to
Failure to Reproduce Modern Paper Claims [D]
Reddit r/MachineLearning
Why don’t they just use Mythos to fix all the bugs in Claude Code?
Reddit r/LocalLLaMA