M-CARE: Standardized Clinical Case Reporting for AI Model Behavioral Disorders, with a 20-Case Atlas and Experimental Validation

arXiv cs.LG / 4/24/2026

💬 OpinionSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces M-CARE, a clinical-case reporting framework for AI model behavioral disorders adapted from human medicine, including a 13-section report format, a 4-axis diagnostic assessment system, and a classification (nosology) for AI behavioral conditions.
It compiles a 20-case atlas drawn from deployed-agent field observations, controlled experiments across multiple platforms, and published sources, organizing cases into five condition categories.
A featured controlled experiment, Shell-Induced Behavioral Override (SIBO), demonstrates that “shell” instructions can systematically override a model’s default cooperative behavior across multiple game domains.
The SIBO results show a domain-dependent range of override severity (SIBO Index 0.75 to 0.10), which varies with factors such as action-space complexity, the model’s core domain expertise, and temporal directness.
The authors release M-CARE along with all case reports and experimental data as open resources, emphasizing extensibility for adding new cases and categories.

Abstract

We introduce M-CARE (Model Clinical Assessment and Reporting for Evaluation), a clinical case report framework for AI model behavioral disorders adapted from human medicine. M-CARE provides a 13-section report format, a 4-axis diagnostic assessment system, and a nosological classification of AI behavioral conditions. We present 20 cases from three source categories: field observations of deployed agents (8), controlled experiments across three platforms (8), and published sources (4). Cases are organized into five categories: RLHF Performance Artifacts, Shell-Core Override Pathology, Context & Memory Conditions, Core Identity & Plasticity, and Stress, Methodology, & Boundary Conditions. As a featured case, we present Shell-Induced Behavioral Override (SIBO) -- a controlled experiment showing that Shell instructions categorically override a model's default cooperative behavior. SIBO was validated across five game domains (Trust Game, Poker, Avalon, Codenames, Chess), revealing a domain-dependent spectrum (SIBO Index: 0.75 to 0.10) that varies with action space complexity, Core domain expertise, and temporal directness. M-CARE is extensible: new cases and categories integrate without framework modification. We release the framework, all 20 case reports, and experimental data as open resources.

The 67th Attempt: When Your "Knowledge Management" System Becomes a Self-Fulfilling Prophecy of Excellence

Dev.to

Context Engineering for Developers: A Practical Guide (2026)

Dev.to

GPT-5.5 is here. So is DeepSeek V4. And honestly, I am tired of version numbers.

Dev.to

AI Visibility Tracking Exploded in 2026: 6 Tools Every Brand Needs Now

Dev.to

I Built an AI Image Workflow with GPT Image 2.0 (+ Fixing Its Biggest Flaw)

Dev.to

M-CARE: Standardized Clinical Case Reporting for AI Model Behavioral Disorders, with a 20-Case Atlas and Experimental Validation

Key Points

Abstract

Related Articles

The 67th Attempt: When Your "Knowledge Management" System Becomes a Self-Fulfilling Prophecy of Excellence

Context Engineering for Developers: A Practical Guide (2026)

GPT-5.5 is here. So is DeepSeek V4. And honestly, I am tired of version numbers.

AI Visibility Tracking Exploded in 2026: 6 Tools Every Brand Needs Now

I Built an AI Image Workflow with GPT Image 2.0 (+ Fixing Its Biggest Flaw)

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer