What if Pinocchio Were a Reinforcement Learning Agent: A Normative End-to-End Pipeline

arXiv cs.AI / 3/18/2026

Key Points

The paper proposes Pino, a hybrid model in which reinforcement learning agents are supervised by argumentation-based normative advisors to achieve norm compliance and context awareness.
It builds on AJAR, Jiminy, and NGRL architectures and introduces a novel algorithm for automatically extracting the arguments and relationships that underlie the advisors' decisions.
The work defines norm avoidance in the context of reinforcement learning agents and provides a mitigation strategy within the proposed pipeline.
Each component of the pipeline is empirically evaluated, and the work discusses limitations and directions for future research.

Abstract

In the past decade, artificial intelligence (AI) has developed quickly. With this rapid progression came the need for systems capable of complying with the rules and norms of our society so that they can be successfully and safely integrated into our daily lives. Inspired by the story of Pinocchio in "Le avventure di Pinocchio - Storia di un burattino", this thesis proposes a pipeline that addresses the problem of developing norm-compliant and context-aware agents. Building on the AJAR, Jiminy, and NGRL architectures, the work introduces Pino, a hybrid model in which reinforcement learning agents are supervised by argumentation-based normative advisors. To make this pipeline operational, the thesis also presents a novel algorithm for automatically extracting the arguments and relationships that underlie the advisors' decisions. Finally, the thesis investigates the phenomenon of norm avoidance, providing a definition and a mitigation strategy within the context of reinforcement learning agents. Each component of the pipeline is empirically evaluated. The thesis concludes with a discussion of related work, current limitations, and directions for future research.
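
To make the idea of an argumentation-based normative advisor supervising a reinforcement learning agent more concrete, here is a minimal sketch. It is not the thesis's implementation. It assumes a Dung-style abstract argumentation framework evaluated under grounded semantics, and an advisor that simply vetoes actions; the summary above does not say which semantics or supervision mechanism Pino uses. All names (`grounded_extension`, `permitted_actions`, `choose_action`, the argument labels, and the Q-values) are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Attack = Tuple[str, str]  # (attacker, target); both must be known arguments


def grounded_extension(arguments: Iterable[str], attacks: Iterable[Attack]) -> Set[str]:
    """Grounded extension of an abstract argumentation framework (assumed semantics)."""
    args = set(arguments)
    attackers: Dict[str, Set[str]] = {a: set() for a in args}
    for src, dst in attacks:
        attackers[dst].add(src)

    accepted: Set[str] = set()
    while True:
        # An argument is defended if every one of its attackers is itself
        # attacked by an already accepted argument.
        defended = {
            a for a in args
            if all(attackers[b] & accepted for b in attackers[a])
        }
        if defended == accepted:  # fixed point reached
            return accepted
        accepted = defended


def permitted_actions(actions: List[str], forbids: Dict[str, str],
                      arguments: Iterable[str], attacks: Iterable[Attack]) -> List[str]:
    """Hypothetical advisor: an action is vetoed if an accepted argument forbids it."""
    accepted = grounded_extension(arguments, attacks)
    banned = {forbids[a] for a in accepted if a in forbids}
    return [act for act in actions if act not in banned]


def choose_action(q_values: Dict[str, float], allowed: List[str]) -> str:
    """Greedy choice restricted to the actions the advisor permits."""
    return max(allowed, key=lambda act: q_values[act])


# Illustrative scenario (made up): the agent's learned values prefer skipping school.
actions = ["go_to_school", "skip_school"]
q_values = {"go_to_school": 0.2, "skip_school": 0.9}
forbids = {"truancy_forbidden": "skip_school"}

# Norm applies: only go_to_school is permitted.
normal = permitted_actions(actions, forbids, ["truancy_forbidden"], [])

# Context defeats the norm: a closed school attacks the truancy argument.
closed = permitted_actions(
    actions, forbids,
    ["truancy_forbidden", "school_closed"],
    [("school_closed", "truancy_forbidden")],
)
```

In this toy setup, `choose_action(q_values, normal)` returns `"go_to_school"` even though the learned values favor skipping, while `choose_action(q_values, closed)` returns `"skip_school"` because the contextual argument defeats the norm. Filtering actions is only one possible design; an advisor could instead shape the reward signal, and the summary does not specify which approach the pipeline takes.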
