BuildWithAI: What Broke, What I Learned, What's Next

Dev.to / 4/5/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • The author describes building and deploying a “BuildWithAI” serverless toolkit on AWS, focusing especially on the practical deployment issues that weren’t covered earlier (architecture, prompts, and cost guardrails).
  • The first major failure was Amazon Bedrock returning an “access denied” error at runtime, traced to model access not being enabled for the Anthropic Claude models used during the initial deployment.
  • They explain that manual one-time enablement is required for certain Bedrock model providers (Anthropic requires a First Time Use/FTU form), while some newer models like Amazon Nova may be enabled by default.
  • The article aims to help readers fork the repository and deploy it themselves by outlining the specific gotchas and the steps needed to run the system on a personal AWS account.
  • It highlights that some access problems can look like IAM issues but actually stem from Bedrock model account/org-level configuration and marketplace/FTU prerequisites.

Overview

The architecture and the prompts are covered. Now for the part that usually gets left out: what actually broke, what could be better, and how to deploy the whole thing on your own AWS account.

BuildWithAI: DR Toolkit on AWS — DESIGN, PROMPT, LEARN

So far we've gone through the serverless stack and 5-layer cost guardrails, then the system prompt pattern and the prompt engineering behind all six tools. This final part is the practical side — the gotchas from development and a step-by-step guide so you can fork the repo and get it running yourself.

Things that broke

Bedrock model access

First deploy went fine. Lambda functions created, API Gateway live, DynamoDB provisioned. Then the first endpoint returned access denied from Bedrock. No helpful error message, just a generic denial.

The issue: when I first deployed this using Claude Sonnet and Haiku, model access had to be enabled manually before the models could be called. It's a one-time step. I initially assumed it was an IAM policy issue and spent time debugging the wrong thing. Amazon Nova models, by contrast, are enabled by default.

Screenshot: Amazon Bedrock Model Catalog showing available models

Note: As of late 2025, Bedrock foundation models are available by default without manual enablement — including Anthropic's.

However, Anthropic models still have one unique requirement: a one-time First Time Use (FTU) form must be submitted before your first Claude invocation. You can complete this by selecting any Anthropic model from the model catalog in the Amazon Bedrock console, or by calling the PutUseCaseForModelAccess API. Once submitted at the account or org level, it's inherited across all accounts in the same AWS Organization.

Additionally, ensure your IAM role has the necessary AWS Marketplace permissions (aws-marketplace:Subscribe, aws-marketplace:Unsubscribe, aws-marketplace:ViewSubscriptions) and that your AWS account has a valid payment method configured — Bedrock auto-subscribes to the model in the background on first invocation, and these permissions are required for that to succeed.
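One way to separate the two failure modes quickly is to probe the model with a minimal invocation and classify the error code before touching IAM. This is a hedged sketch, not code from the repo: `FakeBedrockClient` stands in for a real `boto3.client("bedrock-runtime")` so the example runs offline, and the hint text is mine.

```python
import json

def diagnose_invoke(client, model_id):
    """Try a minimal invocation and classify the failure.

    `client` is expected to expose invoke_model() like
    boto3.client("bedrock-runtime"); pass a real client in practice.
    """
    try:
        client.invoke_model(modelId=model_id, body=json.dumps({"ping": True}))
        return "ok"
    except Exception as exc:
        code = getattr(exc, "response", {}).get("Error", {}).get("Code", "")
        if code == "AccessDeniedException":
            # Could be IAM *or* missing model access / the Anthropic FTU form.
            # Check the Bedrock console's model catalog before debugging IAM.
            return "access-denied: check Bedrock model access/FTU first, then IAM"
        raise

# Offline stand-in for the real client, raising the same error shape
# that botocore's ClientError carries.
class FakeBedrockClient:
    def invoke_model(self, modelId, body):
        err = Exception("An error occurred (AccessDeniedException)")
        err.response = {"Error": {"Code": "AccessDeniedException"}}
        raise err

print(diagnose_invoke(FakeBedrockClient(), "anthropic.claude-3-haiku"))
```

Had I run something like this first, the "generic denial" would at least have pointed at the model-access checklist instead of an IAM rabbit hole.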

CORS on error responses

The Lambda functions returned correct results via curl and the smoke test. But the frontend got "Failed to fetch" errors.

The problem: the response helper was setting CORS headers on success responses but not on error responses. When a Lambda returned 400 or 429, the browser blocked the entire response.

The fix — every response path must include CORS headers:

import json

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Headers": "Content-Type",
    "Content-Type": "application/json",
}

def ok(data):
    return {"statusCode": 200, "headers": CORS_HEADERS, "body": json.dumps(data)}

def error(status, message, code):
    return {"statusCode": status, "headers": CORS_HEADERS,
            "body": json.dumps({"error": message, "code": code})}

The Lambda response headers use * for the origin because the response helper doesn't know the CloudFront domain. The actual origin restriction happens at the API Gateway layer, where allowedOrigins is scoped to the CloudFront domain only. The Lambda-level * is fine here because the API uses rate limiting and daily caps for protection, not auth tokens.

The lesson I keep re-learning: always test error paths from the actual frontend, not just curl. curl doesn't care about CORS.
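A small regression test makes that lesson enforceable in CI. The sketch below restates the `error()` helper from the snippet above and asserts that the failure path carries CORS headers; the test name and where it would live are my assumptions.

```python
import json

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Headers": "Content-Type",
    "Content-Type": "application/json",
}

def error(status, message, code):
    return {"statusCode": status, "headers": CORS_HEADERS,
            "body": json.dumps({"error": message, "code": code})}

def test_error_responses_carry_cors():
    # 400 and 429 are the statuses the browser was silently blocking.
    for status in (400, 429):
        resp = error(status, "bad request", "E_BAD")
        assert resp["headers"]["Access-Control-Allow-Origin"] == "*"
        assert resp["statusCode"] == status

test_error_responses_carry_cors()
print("error paths include CORS headers")
```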

The DynamoDB seed step

After first deploy, python scripts/seed_dynamodb.py needs to run to write the tools_enabled: true config row. Without it, the budget shutoff Lambda (Layer 5 from Part 1) has no row to write to — the safety net isn't connected.

"""Run once after first deploy."""
import boto3

dynamodb = boto3.resource("dynamodb", region_name="ap-southeast-1")
table = dynamodb.Table("dr-toolkit-usage")

table.put_item(Item={
    "pk": "config",
    "sk": "global",
    "tools_enabled": True,
    "disabled_reason": None,
})
print("Config seeded — tools_enabled: True")

This could probably be handled by a custom resource in CloudFormation, but for a project this size, a one-off script after deploy is simpler.

What could be improved

Streaming responses. Right now users wait 2-5 seconds for the full response. Bedrock supports invoke_model_with_response_stream — output could appear word-by-word. The single biggest UX improvement available.

Better observability. The toolkit has CloudWatch logs but no structured metrics. A dashboard showing calls per tool, error rates, and token usage would be a solid addition.

Input validation. The Lambdas accept whatever the frontend sends with no schema validation. Quick fix that would eliminate a class of unexpected errors.
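Of the three, streaming is the most self-contained change. A hedged sketch of the consuming side: real events come from bedrock-runtime's invoke_model_with_response_stream, whose chunks are JSON-encoded bytes, but the exact payload shape varies by model family, so the `outputText` key used here is an assumption, and the fake event list exists only so the sketch runs offline.

```python
import json

def stream_text(events):
    """Yield text fragments from a Bedrock response stream.

    `events` mimics the iterable returned by
    invoke_model_with_response_stream(...)["body"]: each event wraps
    JSON-encoded bytes. The "outputText" key is a simplified, assumed
    payload shape; check your model family's actual chunk format.
    """
    for event in events:
        chunk = event.get("chunk")
        if not chunk:
            continue
        payload = json.loads(chunk["bytes"].decode("utf-8"))
        text = payload.get("outputText", "")
        if text:
            yield text

# Simulated stream so the sketch runs without AWS credentials.
fake_events = [
    {"chunk": {"bytes": json.dumps({"outputText": "Warm "}).encode()}},
    {"chunk": {"bytes": json.dumps({"outputText": "standby"}).encode()}},
]
print("".join(stream_text(fake_events)))  # prints "Warm standby"
```

On the frontend, each yielded fragment would be appended to the page as it arrives instead of waiting for the full 2-5 second response.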

Deploy it yourself

Here's how to get the toolkit running on your own AWS account.

Prerequisites

  • AWS CLI configured (aws sts get-caller-identity works)
  • Node.js ≥ 24 (for Serverless Framework and Next.js)
  • Python 3.14 (update runtime in serverless.yml if using a different version)
  • Bedrock model access enabled for the models you want to use:
    • Current defaults: amazon.nova-pro-v1:0 and amazon.nova-lite-v1:0
    • Also works with Claude, Nova Premier, or any model in the Bedrock Model Catalog
    • Check models.config.json for the exact model IDs your deployment uses

Deploy steps

# 1. Clone the repo
git clone https://github.com/romarcablao/dr-toolkit-on-aws.git
cd dr-toolkit-on-aws

# 2. Update `models.config.json` and deploy everything (backend + frontend + throttle + cache invalidation)
./scripts/deploy.sh

# 3. Seed DynamoDB (first deploy only)
python scripts/seed_dynamodb.py

# 4. Smoke test all 6 endpoints
python scripts/test_tools.py <API_URL>

The deploy script handles: npx serverless deploy, API Gateway throttle configuration, generating the frontend config from models.config.json, building the Next.js static export, syncing to S3, and invalidating CloudFront cache.

Partial deploys are also supported:

./scripts/deploy.sh --skip-backend    # frontend only
./scripts/deploy.sh --skip-frontend   # backend only

After deploy

Update CORS in serverless.yml with your CloudFront domain:

httpApi:
  cors:
    allowedOrigins:
      - 'https://your-cloudfront-domain.cloudfront.net'

Set up the budget alert: AWS Console → Billing → Budgets → Create budget → $10/month → SNS action at 100% pointing to dr-toolkit-budget-alert.

Emergency controls

# Disable all tools immediately
aws dynamodb put-item \
  --table-name dr-toolkit-usage \
  --region ap-southeast-1 \
  --item '{"pk":{"S":"config"},"sk":{"S":"global"},"tools_enabled":{"BOOL":false}}'

# Re-enable
aws dynamodb put-item \
  --table-name dr-toolkit-usage \
  --region ap-southeast-1 \
  --item '{"pk":{"S":"config"},"sk":{"S":"global"},"tools_enabled":{"BOOL":true}}'
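For context on what these commands flip: each handler presumably checks the config row before doing any work. A sketch of that check, with a stub standing in for boto3's `dynamodb.Table("dr-toolkit-usage")`; the function name and the default-to-enabled behavior when the row is missing are my assumptions, not code quoted from the repo.

```python
def tools_enabled(table):
    """Return False when the kill-switch row disables all tools.

    `table` mimics boto3's dynamodb.Table resource: get_item() returns
    {"Item": {...}} when the row exists, or {} when it does not.
    A missing row is treated as enabled here (assumed behavior), which
    matches the seed script being the thing that *connects* the net.
    """
    resp = table.get_item(Key={"pk": "config", "sk": "global"})
    item = resp.get("Item")
    if item is None:
        return True
    return bool(item.get("tools_enabled", True))

# Offline stand-in for the DynamoDB table.
class FakeTable:
    def __init__(self, item):
        self._item = item
    def get_item(self, Key):
        return {"Item": self._item} if self._item else {}

assert tools_enabled(FakeTable({"tools_enabled": True}))
assert not tools_enabled(FakeTable({"tools_enabled": False}))
print("flag check works")
```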

Adding your own tools

  1. Lambda handler — copy any handler in functions/, change TOOL_NAME and the system prompt
  2. Config — add the tool to models.config.json
  3. Route — add a function block in serverless.yml with an httpApi event
  4. Frontend — create a page under frontend/src/app/tools/your-tool/page.tsx using the useToolSubmit hook
  5. Homepage — add a card to the tools array
  6. Deploy: ./scripts/deploy.sh
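As a starting point for step 1, here is a hedged skeleton of what such a handler might look like. The `ok`/`error` helpers follow the response-helper pattern shown earlier; `TOOL_NAME`, the event shape, and the stubbed model call are illustrative assumptions, not code copied from functions/.

```python
import json

TOOL_NAME = "compliance-checker"  # hypothetical 7th tool
SYSTEM_PROMPT = "You review AWS configs and flag policy violations."

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Headers": "Content-Type",
    "Content-Type": "application/json",
}

def ok(data):
    return {"statusCode": 200, "headers": CORS_HEADERS, "body": json.dumps(data)}

def error(status, message, code):
    return {"statusCode": status, "headers": CORS_HEADERS,
            "body": json.dumps({"error": message, "code": code})}

def call_model(prompt):
    # Placeholder: the real handler would send SYSTEM_PROMPT + prompt
    # to Bedrock via invoke_model here.
    return f"[{TOOL_NAME}] reviewed {len(prompt)} chars of input"

def handler(event, context):
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return error(400, "body must be JSON", "E_BAD_JSON")
    user_input = body.get("input", "")
    if not user_input:
        return error(400, "missing 'input'", "E_MISSING_INPUT")
    return ok({"tool": TOOL_NAME, "result": call_model(user_input)})

resp = handler({"body": json.dumps({"input": "Resources: {}"})}, None)
print(resp["statusCode"])  # prints 200
```

Swap the system prompt and tool name, wire up the real Bedrock call, and steps 2-6 are config and routing.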

What's next — your turn

The architecture is in Part 1. The prompts are in Part 2. The deploy steps are above. Here's the challenge:

Deploy this toolkit to your own AWS account.

Fork the repo, run ./scripts/deploy.sh, and get it running. Don't forget to set up the budget alert. It takes about 10 minutes, and the guardrails keep costs under $10/month.

Once it's running, try these:

  • Paste one of your own CloudFormation templates into the DR Reviewer. See what gaps it catches.
  • Run the DR Strategy Advisor with your actual infrastructure parameters. Compare the recommendation to what's in place today.
  • Throw real incident notes into the Post-Mortem Writer. See if the structured output is something you'd actually use.

And if you want to go further:

Add a 7th tool with Kiro. This is how the original six were built. Open the project in Kiro, describe the tool you want in natural language ("a compliance checker that takes an AWS config and flags policy violations"), and let Kiro generate a spec with requirements and an implementation plan before writing any code. Kiro's spec-driven workflow means you get the handler, the system prompt, and the config entry scaffolded from a structured plan rather than freehand prompting. Security audit, cost optimization, compliance check — same architecture, different prompts. The handler pattern from Part 2 means the code side is mostly copy-paste; the interesting part is writing the spec and tuning the system prompt.

Improve what's here. Streaming responses, input validation, a CloudWatch dashboard.

Wrapping up

This series covered the full lifecycle of a serverless AI project on AWS: architecture design (Part 1), prompt engineering (Part 2), and the real-world lessons and deployment (Part 3).

BuildWithAI Series Banner

The DR strategies the toolkit recommends — backup & restore, pilot light, warm standby, multi-site active/active — come straight from the AWS Disaster Recovery whitepaper. That whitepaper is excellent, but there's a gap between understanding the four strategies and having an actual runbook for your infrastructure. These tools try to close that gap.

Try it / Fork it:

Live Demo: https://dr-toolkit.thecloudspark.com

DR Toolkit

AI-powered disaster recovery planning tool for AWS builders. Plan, document, and audit your DR posture with Amazon Bedrock. Resilience planning, accelerated by generative AI.




Source Code: github.com/romarcablao/dr-toolkit-on-aws

DR Toolkit on AWS

Kiro Amazon Bedrock AWS Lambda Amazon DynamoDB Amazon S3 Amazon CloudFront Next.js Tailwind CSS

Tools

| # | Tool | Endpoint | Model | Daily Limit |
|---|------|----------|-------|-------------|
| 1 | Runbook Generator | POST /runbook | Nova Pro | 50/day |
| 2 | RTO/RPO Estimator | POST /rto-estimator | Nova Lite | 50/day |
| 3 | DR Strategy Advisor | POST /dr-advisor | Nova Lite | 50/day |
| 4 | Post-Mortem Writer | POST /postmortem | Nova Lite | 50/day |
| 5 | DR Checklist Builder | POST /checklist | Nova Lite | 50/day |
| 6 | Template DR Reviewer | POST /dr-reviewer | Nova Pro | 30/day |

Screenshot: DR Toolkit tools overview

Architecture

  • Frontend: Next.js 16 (static export) + Tailwind CSS → S3 + CloudFront
  • Backend: AWS Lambda (Python 3.14) → API Gateway HTTP API
  • AI: Amazon Bedrock — Nova Lite (Tools 2–5), Nova Pro (Tools 1, 6)
  • Database: DynamoDB single table dr-toolkit-usage (usage counters + feature flag)
  • IaC: Serverless Framework v3 (serverless.yml)
  • Region: ap-southeast-1 (Singapore)

Project Structure

dr-toolkit/
├── serverless.yml             # Serverless Framework