Energy-Aware Reinforcement Learning for Robotic Manipulation of Articulated Components in Infrastructure Operation and Maintenance

arXiv cs.RO / 3/25/2026

Key Points

  • The paper proposes an articulation-agnostic, energy-aware reinforcement learning framework for robots performing operation and maintenance (O&M) tasks on diverse articulated infrastructure components such as access doors, service drawers, and pipeline valves.
  • It uses part-guided 3D perception with weighted point sampling and PointNet-based encoding to build a compact geometric representation that generalizes across heterogeneous articulated objects.
  • Manipulation is formulated as a constrained Markov Decision Process where actuation energy is explicitly included and regulated using a Lagrangian-based constrained Soft Actor-Critic training approach.
  • Experiments on representative infrastructure O&M tasks report 16–30% lower energy consumption, 16–32% fewer steps to success, and consistently high success rates, suggesting improved scalability for long-term deployment.
  • Overall, the work targets a gap in prior approaches, which focus mainly on grasping or on object-specific articulated manipulation and rarely treat actuation energy as an explicit term in multi-objective optimisation for real O&M use cases.
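
The part-guided weighted point sampling mentioned in the key points can be illustrated with a minimal sketch. The paper's actual weighting scheme is not given in this summary, so everything below is an assumption: the part labels (`handle`, `body`), the weights, and the function name `weighted_point_sample` are hypothetical, and the sampling rule is the standard Efraimidis-Spirakis method for weighted sampling without replacement, swapped in purely for illustration.

```python
import random


def weighted_point_sample(part_labels, part_weights, k, seed=0):
    """Return k distinct point indices, favouring points on high-weight parts.

    part_labels  : one part name per point (hypothetical segmentation output)
    part_weights : dict mapping part name -> positive sampling weight (assumed)
    """
    if k > len(part_labels):
        raise ValueError("k cannot exceed the number of points")
    rng = random.Random(seed)
    keyed = []
    for idx, label in enumerate(part_labels):
        w = part_weights.get(label, 1.0)  # unknown parts get a neutral weight
        if w <= 0:
            raise ValueError("weights must be positive")
        # Efraimidis-Spirakis key: u ** (1 / w); larger weights push keys toward 1.
        keyed.append((rng.random() ** (1.0 / w), idx))
    # The k largest keys form a weighted sample without replacement.
    keyed.sort(reverse=True)
    return sorted(idx for _, idx in keyed[:k])


# Toy cloud: 10 points on the handle, 90 on the body (illustrative only).
labels = ["handle"] * 10 + ["body"] * 90
chosen = weighted_point_sample(labels, {"handle": 50.0, "body": 1.0}, k=20)
handle_hits = sum(1 for i in chosen if labels[i] == "handle")
print(len(chosen), handle_hits)
```

In a pipeline like the one summarised here, the selected indices would pick the points passed to a PointNet-style encoder, so that task-relevant parts such as handles are better represented in the compact geometric feature.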

Abstract

With the growth of intelligent civil infrastructure and smart cities, operation and maintenance (O&M) increasingly requires safe, efficient, and energy-conscious robotic manipulation of articulated components, including access doors, service drawers, and pipeline valves. However, existing robotic approaches either focus primarily on grasping or target object-specific articulated manipulation, and they rarely incorporate explicit actuation energy into multi-objective optimisation, which limits their scalability and suitability for long-term deployment in real O&M settings. Therefore, this paper proposes an articulation-agnostic and energy-aware reinforcement learning framework for robotic manipulation in intelligent infrastructure O&M. The method combines part-guided 3D perception, weighted point sampling, and PointNet-based encoding to obtain a compact geometric representation that generalises across heterogeneous articulated objects. Manipulation is formulated as a Constrained Markov Decision Process (CMDP), in which actuation energy is explicitly modelled and regulated via a Lagrangian-based constrained Soft Actor-Critic scheme. The policy is trained end-to-end under this CMDP formulation, enabling effective articulated-object operation while satisfying a long-horizon energy budget. Experiments on representative O&M tasks demonstrate 16%-30% reductions in energy consumption, 16%-32% fewer steps to success, and consistently high success rates, indicating a scalable and sustainable solution for infrastructure O&M manipulation.
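
The Lagrangian treatment of the energy budget can be sketched as a dual-ascent update on a multiplier that prices energy use. This is a minimal illustration and not the authors' implementation: the function names, the learning rate, the budget value, and the energy proxy (sum of |torque × joint velocity| × dt) are all assumptions, and the full Soft Actor-Critic machinery is omitted.

```python
def step_energy(torques, joint_velocities, dt):
    """Assumed actuation-energy proxy: sum over joints of |torque * velocity| * dt."""
    return sum(abs(t * v) for t, v in zip(torques, joint_velocities)) * dt


def penalised_reward(task_reward, energy, lmbda):
    """Lagrangian-relaxed reward: the task reward minus the priced energy cost."""
    return task_reward - lmbda * energy


def update_lagrange_multiplier(lmbda, episode_energy, energy_budget, lr=0.01):
    """One dual-ascent step on the multiplier.

    The multiplier rises when the episode exceeds the budget and falls
    otherwise; it is clipped at zero so the penalty never becomes a bonus.
    """
    violation = episode_energy - energy_budget
    return max(0.0, lmbda + lr * violation)


# Toy run with made-up episode energies against a hypothetical budget of 10.
lmbda = 0.0
for episode_energy in [14.0, 12.0, 9.0]:
    lmbda = update_lagrange_multiplier(lmbda, episode_energy, energy_budget=10.0, lr=0.1)
    print(round(lmbda, 3))
```

The design idea is that the multiplier adapts automatically: sustained budget violations make energy more expensive for the policy, while slack in the budget relaxes the penalty. Constrained SAC variants often learn a separate cost critic rather than folding the penalty directly into the reward as done here; which form the paper uses is not stated in this summary.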