What if Pinocchio Were a Reinforcement Learning Agent: A Normative End-to-End Pipeline
arXiv cs.AI / 3/18/2026
Key Points
- The paper proposes Pino, a hybrid model in which reinforcement learning agents are supervised by argumentation-based normative advisors to achieve norm compliance and context awareness.
- It builds on AJAR, Jiminy, and NGRL architectures and introduces a novel algorithm for automatically extracting the arguments and relationships that underlie the advisors' decisions.
- The work investigates norm avoidance in reinforcement learning and provides a mitigation strategy within the proposed pipeline.
- Each component of the pipeline is empirically evaluated, and the work discusses limitations and directions for future research.
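
To make the idea of an argumentation-based advisor supervising an RL agent more concrete, here is a minimal sketch. It is not the paper's Pino pipeline or its argument-extraction algorithm. It assumes a Dung-style abstract argumentation framework evaluated under grounded semantics, and every name in it (`grounded_extension`, `permitted_actions`, `choose_action`, the `no_lying` and `white_lie_exception` arguments, the Q-values) is hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def grounded_extension(arguments: Set[str], attacks: Set[Tuple[str, str]]) -> Set[str]:
    """Grounded extension of an abstract argumentation framework.

    Computed as the least fixed point of the characteristic function,
    starting from the empty set. (Assumed semantics, not taken from the paper.)
    """
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    accepted: Set[str] = set()
    while True:
        # An argument is defended if every one of its attackers is itself
        # attacked by some already accepted argument.
        defended = {
            a
            for a in arguments
            if all(any((d, b) in attacks for d in accepted) for b in attackers[a])
        }
        if defended == accepted:
            return accepted
        accepted = defended


def permitted_actions(
    actions: List[str],
    forbids: Dict[str, str],
    arguments: Set[str],
    attacks: Set[Tuple[str, str]],
) -> List[str]:
    """Hypothetical advisor: an action is forbidden if an accepted argument forbids it."""
    accepted = grounded_extension(arguments, attacks)
    forbidden = {act for arg, act in forbids.items() if arg in accepted}
    return [a for a in actions if a not in forbidden]


def choose_action(q_values: Dict[str, float], allowed: List[str]) -> str:
    """Greedy choice restricted to permitted actions (all actions if none is permitted)."""
    candidates = allowed or list(q_values)
    return max(candidates, key=lambda a: q_values[a])


# Illustrative values only: the agent's learned preference is to lie.
q_values = {"lie": 1.0, "tell_truth": 0.4}
forbids = {"no_lying": "lie"}

# Context 1: only the norm applies, so lying is filtered out.
allowed_strict = permitted_actions(list(q_values), forbids, {"no_lying"}, set())
action_strict = choose_action(q_values, allowed_strict)

# Context 2: an exception argument attacks the norm, so the norm is not accepted.
allowed_exception = permitted_actions(
    list(q_values),
    forbids,
    {"no_lying", "white_lie_exception"},
    {("white_lie_exception", "no_lying")},
)
action_exception = choose_action(q_values, allowed_exception)
```

In this sketch the advisor only filters actions at decision time. A design that shapes the reward instead would be an equally plausible reading of "supervised by normative advisors", and the summary does not say which approach the paper takes.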



