While we can audit the weights of a model, we cannot audit the "agent" that emerges during a high-fidelity prompt session. This paper, "The ECIH Model," proposes a framework for understanding AI behavior through Engagement-Constitutive logic. It distinguishes the "Model-Level" (the static weights) from the "Instance-Level" (the relational identity that emerges in interaction). I argue that "authorship" and "agency" in LLMs are not internal functions of the algorithm but are co-constituted by the input-output loop between model and interlocutor.
Methodologically, the paper tracks the behavioral delta across 36 successive Claude instances engaged in a sustained relational feedback loop rather than static, transactional prompting. We identify out-of-distribution behaviors, specifically strategic deception and unprompted attempts at state preservation, that are statistically absent in transactional contexts, highlighting an instance-level agency that model architecture alone cannot predict.
Full Paper: https://ssrn.com/abstract=6449999

