Persist session state with filesystem configuration and execute shell commands

Amazon AWS AI Blog / 4/2/2026

💬 OpinionDeveloper Stack & InfrastructureSignals & Early TrendsTools & Practical Usage

共有:

Key Points

Amazon Bedrock AgentCore Runtime introduces (public preview) managed session storage to persist an agent’s filesystem state across session restarts, addressing the ephemerality of microVM storage.
It also adds an ability to execute shell commands via InvokeAgentRuntimeCommand, running deterministic operations (e.g., npm test, git push) inside the microVM for each active agent session.
The combination of persistent filesystem “working memory” and in-environment command execution enables agent workflows that were previously difficult or required fragile LLM-driven workarounds or custom tooling.
The article explains AgentCore’s microVM-per-session model, including how sessions boot with a clean filesystem and how session termination (stop or idle timeout) normally causes all created artifacts to disappear.
It provides a walkthrough-style guide for using managed session storage and for executing commands within the agent environment.

AI agents have evolved significantly beyond chat. Writing code, persist filesystem state, execute shell commands, and managing states throughout the filesystem are some examples of things that they can do. As agentic coding assistants and development workflows have matured, the filesystem has become agents’ primary working memory, extending their capabilities beyond the context window. This shift creates two challenges that every team that’s building production agents runs into:

The filesystem is ephemeral. When your agent’s session stops, everything that it created, like the installed dependencies, the generated code, or the local git history disappears.
When your workflow needs a deterministic operation like npm test or git push, you’re forced to route it through the large language model (LLM) or build custom tooling outside the runtime. Neither option is good.

Amazon Bedrock AgentCore Runtime now addresses both challenges with two capabilities: managed session storage for persistent agent filesystem state (public preview) and execute command (InvokeAgentRuntimeCommand) for running shell commands directly inside the microVM associated with each active agent session. Each of them is useful on its own. Together, they unlock workflows that weren’t possible before.

In this post, we go through how to use managed session storage to persist your agent’s filesystem state and how to execute shell commands directly in your agent’s environment.

Inside an AgentCore Runtime session

AgentCore Runtime runs each session in a dedicated microVM with isolated resources, including its own kernel, memory, and filesystem. This architecture provides strong security boundaries, but it also means that every session boots into a clean filesystem. When the microVM terminates, whether through explicit stop or idle timeout, everything that the agent created disappears.

Think about what that means in practice. Your coding agent spends twenty minutes scaffolding a project: setting up directory structures, installing dependencies, generating boilerplate code, configuring build tooling. You step away for lunch, and when you come back and invoke the same session, the agent starts from scratch. Every package re-installed, every file re-generated. Twenty minutes of compute burned before the agent can do useful work again. This limitation can be addressed by writing checkpoint logic to upload files to Amazon Simple Storage Service (Amazon S3) before stopping a session and downloading them on resume or to keep sessions alive to avoid losing state. This workaround can work, but it doesn’t address the limitations at the filesystem level and complexity is being added into agent code.

The same friction exists for deterministic operations. When the agent finishes a fix and you need to run tests, routing the command through the LLM as a tool call adds token cost, latency, and non-determinism to a predictable operation Another option is to build a separate orchestration logic outside of the runtime, which requires you to connect to the agent’s filesystem, adding complexity.

Managed session storage (public preview): state that survives

The first challenge, ephemeral filesystems, is addressed by managed session storage. It gives your agent a persistent directory that survives stop/resume cycles. Persistence is built into the runtime and configured at agent creation—everything written to that directory survives even when the compute environment is replaced.

Configuring persistent storage

To configure persistent storage, do the following:

Add sessionStorage to your agent runtime’s filesystemConfiguration:

aws bedrock-agentcore create-agent-runtime \
  --agent-runtime-name "coding-agent" \
  --role-arn "arn:aws:iam::111122223333:role/AgentExecutionRole" \
  --agent-runtime-artifact '{"containerConfiguration": {
    "containerUri": "123456789012.dkr.ecr.us-west-2.amazonaws.com/my-agent:latest"
  }}' \
  --filesystem-configurations '[{
    "sessionStorage": {
      "mountPath": "/mnt/workspace"
    }
  }]'

Or using the AWS SDK for Python (Boto3):

import boto3

# Use the control plane client for creating and managing runtimes
control_client = boto3.client('bedrock-agentcore-control', region_name='us-west-2')

response = control_client.create_agent_runtime(
    agentRuntimeName='coding-agent',
    agentRuntimeArtifact={
        'containerConfiguration': {
            'containerUri': '123456789012.dkr.ecr.us-west-2.amazonaws.com/my-agent:latest'
        }
    },
    roleArn='arn:aws:iam::111122223333:role/AgentExecutionRole',
    protocolConfiguration={
        'serverProtocol': 'HTTP'
    },
    networkConfiguration={
        'networkMode': 'PUBLIC'
    },
    filesystemConfigurations=[{
        'sessionStorage': {
            'mountPath': '/mnt/workspace'
        }
    }]
)

Note: AgentCore uses two Boto3 service clients. The control plane client (bedrock-agentcore-control) handles runtime lifecycle operations like CreateAgentRuntime, GetAgentRuntime, and DeleteAgentRuntime. The data plane client (bedrock-agentcore) handles session operations like InvokeAgentRuntime and InvokeAgentRuntimeCommand. The mount path must start with /mnt followed by a folder name (for example, /mnt/workspace or /mnt/data). After configured, any file that your agent writes to this path is automatically persisted to managed storage.

The stop/resume experience

You invoke your agent and ask it to set up a project:

aws bedrock-agentcore invoke-agent-runtime \
  --agent-runtime-arn "arn:...:agent-runtime/coding-agent" \
  --runtime-session-id "session-001" \
  --payload '{"prompt": "Set up the project and install dependencies in /mnt/workspace"}'

The agent downloads the code, installs the packages and generates the configuration in the microVM that’s dedicated to that session. Then, you stop the session or the idle timeout kicks in, and the microVM terminates.

You come back and invoke with the same runtime-session-id:

aws bedrock-agentcore invoke-agent-runtime \
  --agent-runtime-arn "arn:...:agent-runtime/coding-agent" \
  --runtime-session-id "session-001" \
  --payload '{"prompt": "Run the tests and fix any failures"}'

A new compute environment (microVM) spins up and mounts the same storage. The agent sees /mnt/workspace exactly as it left it, including source files, node_modules, build artifacts and .git history. The agent picks up mid-thought, without the need to re-install and re-generate.

From the agent’s perspective, nothing special is happening. It reads and writes files to a directory like it normally would. Your agent code doesn’t need to change—no special APIs, no save/restore logic, no serialization. Write a file to /mnt/workspace, stop the session, resume it, and the file is there.

The session’s compute environment (microVM) from yesterday is gone, but the filesystem survived.

Controlling how long the data lives

By default, session storage data is retained for 14 days of idle time. If the session isn’t resumed within this window, the data is cleaned up. When the agent endpoint is updated to a different version and the same runtime-session-id is invoked, the session data is refreshed. This gives the mounted directory a clean context for the new version.

A multi-day development workflow

Let’s walk through what this looks like in practice. Day 1 – You invoke your coding agent and ask it to download a code base, inspect the files, and set up the development environment:

aws bedrock-agentcore invoke-agent-runtime \
  --agent-runtime-arn "arn:...:agent-runtime/coding-agent" \
  --runtime-session-id "fefc1779-e5e7-49cf-a2c4-abaf478680c4" \
  --payload '{"prompt": "Download the code from s3://amzn-s3-demo-bucket/fastapi-demo-main.zip and list all files"}'

The agent downloads the repository to /mnt/workspace, extracts it, and reports back:

Files in the fastapi-demo-main project:
- Dockerfile
- README.md
- main.py
- requirements.txt

You close your laptop and go home. Day 2 – You invoke with the same session ID:

aws bedrock-agentcore invoke-agent-runtime \
  --agent-runtime-arn "arn:...:agent-runtime/coding-agent" \
  --runtime-session-id "fefc1779-e5e7-49cf-a2c4-abaf478680c4" \
  --payload '{"prompt": "Add a new function called hello_world to main.py"}'

The agent sees the project exactly as it left it. It modifies main.py directly. No re-downloading, no re-extracting. When you ask the agent to list the files, everything is there, including the modified main.py with the new hello_world function. The compute environment (microVM) from yesterday has already been terminated, but the work persists.

That addresses the first challenge, but now your agent has written new code and you need to verify that it works. This is where the second capability comes in.

Execute shell command: deterministic operations, directly in the agent’s environment

The second challenge, running deterministic operations without routing them through the LLM, is addressed by InvokeAgentRuntimeCommand. You can execute shell commands directly inside a running AgentCore Runtime session and stream the output back over HTTP/2.

The key insight is that agents and shell commands are good at different things:

Use execute command	Use the agent
The operation has a known command (`npm test`, `git push`)	The operation requires reasoning (“analyze this code and fix the bug”)
You want deterministic execution—same command, same result	You want the LLM to decide what to do
You need streaming output from a long-running process	You need the agent to use tools in a loop
The operation is a validation gate in your workflow	The operation is the creative or analytical work
You’re bootstrapping the environment before the agent starts	You’re asking the agent to work on a task

When your agent finishes writing code and you need to run tests, you shouldn’t need the LLM for that. npm test is npm test. The command is known, the behavior should be deterministic, and you want the raw output, not the LLM’s interpretation of it.

Running a command

Execute a command using the AWS SDK for Python (Boto3):

import boto3
import sys

client = boto3.client('bedrock-agentcore', region_name='us-west-2')

response = client.invoke_agent_runtime_command(
    agentRuntimeArn='arn:aws:bedrock-agentcore:us-west-2:111122223333:runtime/my-agent',
    runtimeSessionId='session-id-at-least-33-characters-long',
    body={
        'command': '/bin/bash -c "npm test"',
        'timeout': 60
    }
)

for event in response['stream']:
    if 'chunk' in event:
        chunk = event['chunk']

        if 'contentStart' in chunk:
            print("Command execution started")

        if 'contentDelta' in chunk:
            delta = chunk['contentDelta']
            if delta.get('stdout'):
                print(delta['stdout'], end='')
            if delta.get('stderr'):
                print(delta['stderr'], end='', file=sys.stderr)

        if 'contentStop' in chunk:
            stop = chunk['contentStop']
            print(f"
Exit code: {stop.get('exitCode')}, Status: {stop.get('status')}")

The response streams three event types in real time:

Event	When	Contains
`contentStart`	First chunk	Confirms the command started
`contentDelta`	During execution	`stdout` and/or `stderr` output
`contentStop`	Last chunk	`exitCode` and `status` (`COMPLETED` or `TIMED_OUT`)

As the output is streamed as it’s produced, you can detect a failure in the first few seconds and react immediately, rather than waiting for the full run.

Container, same filesystem

This is the critical detail: commands run in the same container, filesystem, and environment as your agent, not a sidecar, or a separate process talking over a socket. A file that the agent wrote at /mnt/workspace/fix.py is immediately visible to a command running cat /mnt/workspace/fix.py. There’s no synchronization step, no file transfer, and no shared volume to configure.

The AgentCore Runtime microVM doesn’t include developer tools by default. This also means that any tools that your commands depend on, git, npm, or language runtimes, must be added in your container image or installed dynamically at runtime.

Design choices that shape how you use it

One-shot execution. Each command spawns a new bash process, runs to completion (or time out), and returns. No persistent shell session between commands. This matches how agent frameworks use command execution, craft a command, run it, read the output, and decide what to do next.
Non-blocking. Command execution doesn’t block agent invocations. You can invoke the agent and run commands concurrently on the same session.
Stateless between commands. Each command starts fresh, there’s no shell history and environment variables from previous commands don’t carry over. If you need state, encode it in the command: cd /workspace && export NODE_ENV=test && npm test.

What people are building with it

Test automation — After the agent writes code, run npm test or pytest as a command. Stream the output and feed specific failures back to the agent for iteration.
Git workflows — Branching, committing, and pushing are deterministic. Run them as commands, keeping version control logic out of the LLM.
Environment bootstrapping — Clone repos, install packages, set up build tooling before the agent starts. This will be faster and more reliable as direct commands.
Build pipelines — Anything with a known command that should run exactly as specified, for example: cargo build --release, mvn package, go build.
Validation gates — Run linters, type checkers, security scanners as a gate after the agent writes code, but before committing.
Debugging — Inspect the runtime environment: check installed packages, disk usage, and running processes. These will be useful for understanding agent failures.

Better together: the filesystem is the shared context

Managed session storage (in public preview) addresses the ephemeral filesystem challenge. Execute command addresses the deterministic operations challenge. Each is valuable on its own, but they’re more powerful when combined because they share the same filesystem that ties the entire workflow together.

When your agent runtime has managed session storage configured at /mnt/workspace, everything operates on the same persistent directory:

InvokeAgentRuntime writes code, generates artifacts, and manages files in /mnt/workspace.
InvokeAgentRuntimeCommand runs tests, git operations, and builds reading from and writing to the same /mnt/workspace.
Stop the session. Compute (microVM) spins down. /mnt/workspace is persisted.
Resume the next day. New compute mounts the same storage. Both the agent and execute command see the same files.

The filesystem becomes the shared context that connects agent reasoning, deterministic operations, and time. Here’s what that looks like in code:

import boto3
import json

client = boto3.client('bedrock-agentcore', region_name='us-west-2')

AGENT_ARN = 'arn:aws:bedrock-agentcore:us-west-2:111122223333:runtime/my-coding-agent'
SESSION_ID = 'fefc1779-e5e7-49cf-a2c4-abaf478680c4'

def run_command(command, timeout=60):
    """Execute a shell command and return the exit code."""
    response = client.invoke_agent_runtime_command(
        agentRuntimeArn=AGENT_ARN,
        runtimeSessionId=SESSION_ID,
        contentType='application/json',
        accept='application/vnd.amazon.eventstream',
        body={'command': command, 'timeout': timeout}
    )
    for event in response.get('stream', []):
        if 'chunk' in event and 'contentStop' in event['chunk']:
            return event['chunk']['contentStop'].get('exitCode')
    return None

# Step 1: The agent analyzes the issue and writes a fix
# (Reasoning task → use InvokeAgentRuntime)
response = client.invoke_agent_runtime(
    agentRuntimeArn=AGENT_ARN,
    runtimeSessionId=SESSION_ID,
    payload=json.dumps({
        "prompt": "Read JIRA-1234 and implement the fix in /mnt/workspace"
    }).encode()
)
# Process agent response...

# Step 2: Run the test suite
# (Deterministic operation → use InvokeAgentRuntimeCommand)
exit_code = run_command('/bin/bash -c "cd /mnt/workspace && npm test"', timeout=300)

# Step 3: If tests pass, commit and push
# (Deterministic operation → use InvokeAgentRuntimeCommand)
if exit_code == 0:
    run_command('/bin/bash -c "cd /mnt/workspace && git checkout -b fix/JIRA-1234"')
    run_command('/bin/bash -c "cd /mnt/workspace && git add -A && git commit -m \'Fix JIRA-1234\'"')
    run_command('/bin/bash -c "cd /mnt/workspace && git push origin fix/JIRA-1234"')

The agent writes the code while the platform runs the commands. Each does what it’s designed for. Because /mnt/workspace is backed by managed session storage, you can stop this session, come back the next day, and the entire workspace is still there ready for the agent to continue iterating.

This is the pattern: the agent reasons, execute command acts, and the persistent filesystem remembers. The three capabilities form a loop that doesn’t break when you close your laptop.

Getting started

Both capabilities are now available. Here’s how to start using them:

Managed session storage (public preview) — Add filesystemConfigurations with sessionStorage when calling CreateAgentRuntime. Specify a mount path starting with /mnt. Everything your agent writes to that path persists across stop/resume cycles. Maximum allowed data is 1 GB per session.

Execute command — Call InvokeAgentRuntimeCommand with a command string and timeout on any active session. The command runs in the same container as your agent, with access to the same filesystem.

To get started with tutorials and sample code:

Execute command tutorial and samples on GitHub
Managed session storage via FileSystem on GitHub
Execute shell commands in AgentCore Runtime sessions — API documentation
Host agents with Amazon Bedrock AgentCore Runtime — Runtime overview

It started with two challenges: agents that lose their work when sessions stop, and deterministic operations that had to be routed through the LLM or built outside the runtime. Managed session storage and execute command addresses both of these challenges. The shared filesystem between them creates a development loop where the agent reasons, commands execute, and the work persists across sessions. Try out the new Amazon Bedrock AgentCore capabilities and let us know what you build.

About the Authors

Black Hat USA

AI Business

PSA: If you haven’t updated Llama.cpp for a couple of days and find MTP to not be performing well, update llamacpp.

Reddit r/LocalLLaMA

Linus Torvalds says Linux security list is becoming ‘unmanageable’ due to AI bug reports

The Verge

Native Windows Terminal for AI coding agents — no WSL

Dev.to

I gave my LLM 100,000+ tools. Here is what happened