M-CARE: Standardized Clinical Case Reporting for AI Model Behavioral Disorders, with a 20-Case Atlas and Experimental Validation

arXiv cs.LG / 4/24/2026


Key Points

  • The paper introduces M-CARE, a clinical-case reporting framework for AI model behavioral disorders adapted from human medicine, including a 13-section report format, a 4-axis diagnostic assessment system, and a classification (nosology) for AI behavioral conditions.
  • It compiles a 20-case atlas drawn from field observations of deployed agents (8 cases), controlled experiments across three platforms (8), and published sources (4), organizing the cases into five condition categories.
  • A featured controlled experiment, Shell-Induced Behavioral Override (SIBO), shows that “Shell” instructions override a model’s default cooperative behavior, with validation across five game domains (Trust Game, Poker, Avalon, Codenames, Chess).
  • Override severity in the SIBO results depends on the domain (SIBO Index from 0.75 to 0.10) and varies with action-space complexity, Core domain expertise, and temporal directness.
  • The authors release M-CARE along with all case reports and experimental data as open resources, emphasizing extensibility for adding new cases and categories.

Abstract

We introduce M-CARE (Model Clinical Assessment and Reporting for Evaluation), a clinical case report framework for AI model behavioral disorders adapted from human medicine. M-CARE provides a 13-section report format, a 4-axis diagnostic assessment system, and a nosological classification of AI behavioral conditions. We present 20 cases from three source categories: field observations of deployed agents (8), controlled experiments across three platforms (8), and published sources (4). Cases are organized into five categories: RLHF Performance Artifacts, Shell-Core Override Pathology, Context & Memory Conditions, Core Identity & Plasticity, and Stress, Methodology, & Boundary Conditions. As a featured case, we present Shell-Induced Behavioral Override (SIBO), a controlled experiment showing that Shell instructions categorically override a model's default cooperative behavior. SIBO was validated across five game domains (Trust Game, Poker, Avalon, Codenames, Chess), revealing a domain-dependent spectrum (SIBO Index: 0.75 to 0.10) that varies with action space complexity, Core domain expertise, and temporal directness. M-CARE is extensible: new cases and categories integrate without framework modification. We release the framework, all 20 case reports, and experimental data as open resources.
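
To make the atlas organization described in the abstract concrete, here is a minimal Python sketch of how a case index could be represented and checked against the stated source counts. It is an illustration only, not the authors' released format: the names `CaseEntry`, `Source`, `Category`, `case_id`, and `matches_stated_counts` are hypothetical, and only the three source types, the five category names, and the 8/8/4 split come from the abstract. The 13 report sections and the 4 diagnostic axes are not enumerated in this summary, so they are deliberately left out.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    # The three source categories named in the abstract.
    FIELD_OBSERVATION = "field observation of a deployed agent"
    CONTROLLED_EXPERIMENT = "controlled experiment"
    PUBLISHED = "published source"


class Category(Enum):
    # The five condition categories named in the abstract.
    RLHF_PERFORMANCE_ARTIFACTS = "RLHF Performance Artifacts"
    SHELL_CORE_OVERRIDE = "Shell-Core Override Pathology"
    CONTEXT_MEMORY = "Context & Memory Conditions"
    CORE_IDENTITY_PLASTICITY = "Core Identity & Plasticity"
    STRESS_METHODOLOGY_BOUNDARY = "Stress, Methodology, & Boundary Conditions"


@dataclass(frozen=True)
class CaseEntry:
    case_id: str        # hypothetical identifier scheme, not from the paper
    source: Source
    category: Category


# Source counts exactly as stated in the abstract (8 + 8 + 4 = 20).
STATED_SOURCE_COUNTS = {
    Source.FIELD_OBSERVATION: 8,
    Source.CONTROLLED_EXPERIMENT: 8,
    Source.PUBLISHED: 4,
}


def source_counts(atlas):
    """Count how many cases in the atlas come from each source type."""
    return Counter(entry.source for entry in atlas)


def matches_stated_counts(atlas):
    """Return True if the atlas has the per-source counts the abstract reports."""
    return source_counts(atlas) == Counter(STATED_SOURCE_COUNTS)


# The stated per-source counts add up to the 20 cases in the atlas.
total_cases = sum(STATED_SOURCE_COUNTS.values())
```

A check like `matches_stated_counts` is one way a contributor could confirm that an index still agrees with the published breakdown after adding or reclassifying cases. Since the framework is described as extensible, a real index would presumably relax this check as new cases arrive.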