Is it actually possible to define a persistent, model-agnostic text-based layer (loaded with the model each time) that keeps an AI system behaviorally consistent across time? I don’t mean just a typical system prompt, but something more structured that constrains how the system resolves conflicts, prioritizes things, and makes decisions even under things like context drift, conflicting instructions, or prompt injection.
Right now it feels like most consistency comes from training or the model itself, so I’m wondering if there’s a fundamental reason a separate layer like this wouldn’t hold up in practice.
[link] [comments]
