Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production
arXiv cs.AI / 4/15/2026
Key Points
- The paper tackles human-robot task planning and allocation (HRTPA) in production by accounting for worker physical fatigue, keeping task assignments within safe limits while optimizing efficiency under dynamic production conditions.
- It argues that commonly used HRTPA fatigue-recovery models depend on static hyperparameters; instead, it treats fatigue-related parameters as uncertain quantities estimated online from observed fatigue progression.
- The proposed method, PF-CD3Q, combines particle-filter-based online fatigue estimation with safe reinforcement learning using constrained dueling double deep Q-learning for real-time decision-making.
- During planning, the system predicts fatigue per task and filters out actions expected to exceed fatigue thresholds, turning HRTPA into a constrained Markov decision process to ensure safety.
- The work positions itself within Industry 5.0/ergonomics, aiming to make collaborative manufacturing more resilient to day-to-day variability in human fatigue sensitivity.
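The combination of online parameter estimation and action filtering described above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: it assumes a simple exponential fatigue-accumulation model, uses a particle filter to estimate the worker's unknown fatigue rate `lam` from observed fatigue levels, and then filters out candidate tasks whose predicted end-of-task fatigue would exceed a safety threshold. All function and class names, the dynamics model, and the noise parameters are hypothetical choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed exponential fatigue dynamics (a common simplification):
# working for `duration` maps fatigue f -> 1 - (1 - f) * exp(-lam * duration)
def predict_fatigue(f, lam, duration):
    return 1.0 - (1.0 - f) * np.exp(-lam * duration)

class FatigueParticleFilter:
    """Tracks a posterior over the unknown fatigue rate `lam` via particles."""

    def __init__(self, n=500, lam_range=(0.01, 0.5), obs_noise=0.05):
        self.lam = rng.uniform(*lam_range, size=n)  # particle hypotheses
        self.w = np.full(n, 1.0 / n)                # particle weights
        self.obs_noise = obs_noise                  # assumed sensor noise

    def update(self, f_prev, f_obs, duration):
        # Weight each particle by how well it predicts the observed fatigue.
        pred = predict_fatigue(f_prev, self.lam, duration)
        lik = np.exp(-0.5 * ((f_obs - pred) / self.obs_noise) ** 2)
        self.w *= lik
        self.w += 1e-300  # guard against all-zero weights
        self.w /= self.w.sum()
        # Resample (with jitter) when the effective sample size collapses.
        if 1.0 / np.sum(self.w ** 2) < 0.5 * len(self.lam):
            idx = rng.choice(len(self.lam), size=len(self.lam), p=self.w)
            self.lam = self.lam[idx] + rng.normal(0.0, 0.005, len(self.lam))
            self.w[:] = 1.0 / len(self.lam)

    def estimate(self):
        return float(np.sum(self.w * self.lam))

def safe_actions(f_now, tasks, pf, threshold=0.8):
    """Keep only tasks whose predicted end fatigue stays under the threshold,
    mimicking the constrained-MDP action filtering step."""
    lam_hat = pf.estimate()
    return [name for name, dur in tasks
            if predict_fatigue(f_now, lam_hat, dur) <= threshold]
```

In a full PF-CD3Q-style system, the surviving actions would then be scored by a constrained dueling double DQN; here the filter alone shows how an online fatigue estimate narrows the safe action set per worker and per day.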
Related Articles
Are gamers being used as free labeling labor? The rise of "Simulators" that look like AI training grounds [D]
Reddit r/MachineLearning

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
Dev.to
Failure to Reproduce Modern Paper Claims [D]
Reddit r/MachineLearning
Why don’t they just use Mythos to fix all the bugs in Claude Code?
Reddit r/LocalLLaMA