AI Agent Privilege Design: Least Privilege, Sandbox, Human Approval

AI Navigate Original / 4/27/2026


Key Points

  • Agents execute an LLM's decisions directly, so prompt injection or hallucination can cause real damage
  • Three pillars: least privilege, sandboxing, and human approval for irreversible operations
  • Write every operation to an audit log and retain it for 90+ days so incidents can be traced afterwards
  • A 2026 incident destroyed production and its backups; keep backups outside the agent's scope

Why Privilege Design Matters

AI agents (Claude Code, Devin, Operator, Replit Agent, etc.) let an LLM's decisions be executed directly through external tools. That is convenient, but a prompt injection or a hallucination can translate straight into a DELETE on the production DB or an email to every customer.

MCP (Model Context Protocol) spread widely in 2025, and the number of tools an agent can reach exploded. That is exactly why three pillars are essential: least privilege, sandboxing, and human approval.

1. Least Privilege

Give each agent only the minimum scope needed for the task.

  • Read-only keys: data-aggregation agents get SELECT-only credentials. INSERT/UPDATE/DELETE belongs to a separate agent.
  • Directory restriction: a coding agent can write only under a specific repo and cannot read /etc or ~/.ssh.
  • API scope: for GitHub Apps, grant read-only repository permissions (for example, Contents: read). Issue OAuth tokens per user rather than sharing one.
  • Expiry: make tokens short-lived (1-24h) and rotate them periodically.
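
The checks above can be sketched as a small in-process policy object. This is a minimal illustration, not a definitive implementation: AgentGrant, issue_grant, may_write, and may_run_sql are hypothetical names invented for this example, and the path /srv/agent-repo is a placeholder. The SQL check only inspects the first keyword, so it is a coarse pre-filter; real enforcement should live in the database role and the OS or sandbox permissions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class AgentGrant:
    """Hypothetical per-agent grant: what one agent may touch, and until when."""
    write_root: Path           # the only directory the agent may write under
    sql_verbs: frozenset       # allowed leading SQL keywords, e.g. {"SELECT"}
    expires_at: datetime       # the grant is useless after this instant


def issue_grant(write_root: str, sql_verbs: set, ttl_hours: float) -> AgentGrant:
    # Assumption: enforce the 1-24h lifetime from the expiry guideline above.
    if not 1 <= ttl_hours <= 24:
        raise ValueError("ttl_hours must be between 1 and 24")
    return AgentGrant(
        write_root=Path(write_root).resolve(),
        sql_verbs=frozenset(v.upper() for v in sql_verbs),
        expires_at=datetime.now(timezone.utc) + timedelta(hours=ttl_hours),
    )


def is_expired(grant: AgentGrant, now: Optional[datetime] = None) -> bool:
    now = now or datetime.now(timezone.utc)
    return now >= grant.expires_at


def may_write(grant: AgentGrant, path: str) -> bool:
    # resolve() collapses ".." so "repo/../etc/passwd" cannot escape the root.
    target = Path(path).resolve()
    return not is_expired(grant) and target.is_relative_to(grant.write_root)


def may_run_sql(grant: AgentGrant, statement: str) -> bool:
    # Coarse first-keyword check only; it does not parse the statement.
    words = statement.strip().split(None, 1)
    if not words or is_expired(grant):
        return False
    return words[0].upper() in grant.sql_verbs


# Usage: a read-only aggregation agent confined to one repo for one hour.
grant = issue_grant("/srv/agent-repo", {"select"}, ttl_hours=1)
print(may_write(grant, "/srv/agent-repo/src/main.py"))
print(may_write(grant, "/srv/agent-repo/../etc/passwd"))
print(may_run_sql(grant, "DELETE FROM users"))
```

Denying by default keeps the failure mode safe: an expired grant, an empty statement, or a path outside the root all return False, so the agent has to be given access explicitly rather than having it taken away.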
