BuildWithAI: What Broke, What I Learned, What's Next

Dev.to / 4/5/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • The author describes building and deploying a “BuildWithAI” serverless toolkit on AWS, focusing especially on the practical deployment issues that weren’t covered earlier (architecture, prompts, and cost guardrails).
  • The first major failure was Amazon Bedrock returning an “access denied” error at runtime, traced to model access not being enabled for the Anthropic Claude models used during the initial deployment.
  • They explain that manual one-time enablement is required for certain Bedrock model providers (Anthropic requires a First Time Use/FTU form), while some newer models like Amazon Nova may be enabled by default.
  • The article aims to help readers fork the repository and deploy it themselves by outlining the specific gotchas and the steps needed to run the system on a personal AWS account.
  • It highlights that some access problems can look like IAM issues but actually stem from Bedrock model account/org-level configuration and marketplace/FTU prerequisites.

Overview

The architecture and the prompts are covered. Now for the part that usually gets left out: what actually broke, what could be better, and how to deploy the whole thing on your own AWS account.

BuildWithAI: DR Toolkit on AWS — DESIGN, PROMPT, LEARN

So far we've gone through the serverless stack and 5-layer cost guardrails, then the system prompt pattern and the prompt engineering behind all six tools. This final part is the practical side — the gotchas from development and a step-by-step guide so you can fork the repo and get it running yourself.

Things that broke

Bedrock model access

First deploy went fine. Lambda functions created, API Gateway live, DynamoDB provisioned. Then the first endpoint returned access denied from Bedrock. No helpful error message, just a generic denial.

The issue: when I first deployed this using Claude Sonnet and Haiku, model access had to be enabled manually before the models could be called. It's a one-time step. I initially assumed it was an IAM policy issue and spent time debugging the wrong thing. Amazon Nova models, by contrast, are enabled by default.

Screenshot: Amazon Bedrock Model Catalog showing available models

Note: As of late 2025, Bedrock foundation models are available by default without manual enablement — including Anthropic's.

However, Anthropic models still have one unique requirement: a one-time First Time Use (FTU) form must be submitted before your first Claude invocation. You can complete this by selecting any Anthropic model from the model catalog in the Amazon Bedrock console, or by calling the PutUseCaseForModelAccess API. Once submitted at the account or org level, it's inherited across all accounts in the same AWS Organization.

Additionally, ensure your IAM role has the necessary AWS Marketplace permissions (aws-marketplace:Subscribe, aws-marketplace:Unsubscribe, aws-marketplace:ViewSubscriptions) and that your AWS account has a valid payment method configured — Bedrock auto-subscribes to the model in the background on first invocation, and these permissions are required for that to succeed.
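One way to separate the two failure modes quickly is to probe the model with a minimal invocation and classify the error code before touching IAM. This is a hedged sketch, not code from the repo: `FakeBedrockClient` stands in for a real `boto3.client("bedrock-runtime")` so the example runs offline, and the hint text is mine.

```python
import json

def diagnose_invoke(client, model_id):
    """Try a minimal invocation and classify the failure.

    `client` is expected to expose invoke_model() like
    boto3.client("bedrock-runtime"); pass a real client in practice.
    """
    try:
        client.invoke_model(modelId=model_id, body=json.dumps({"ping": True}))
        return "ok"
    except Exception as exc:
        code = getattr(exc, "response", {}).get("Error", {}).get("Code", "")
        if code == "AccessDeniedException":
            # Could be IAM *or* missing model access / the Anthropic FTU form.
            # Check the Bedrock console's model catalog before debugging IAM.
            return "access-denied: check Bedrock model access/FTU first, then IAM"
        raise

# Offline stand-in for the real client, raising the same error shape
# that botocore's ClientError carries.
class FakeBedrockClient:
    def invoke_model(self, modelId, body):
        err = Exception("An error occurred (AccessDeniedException)")
        err.response = {"Error": {"Code": "AccessDeniedException"}}
        raise err

print(diagnose_invoke(FakeBedrockClient(), "anthropic.claude-3-haiku"))
```

Had I run something like this first, the "generic denial" would at least have pointed at the model-access checklist instead of an IAM rabbit hole.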

CORS on error responses

The Lambda functions returned correct results via curl and the smoke test. But the frontend got "Failed to fetch" errors.

The problem: the response helper was setting CORS headers on success responses but not on error responses. When a Lambda returned 400 or 429, the browser blocked the entire response.

The fix — every response path must include CORS headers:

import json

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Headers": "Content-Type",
    "Content-Type": "application/json",
}

def ok(data):
    return {"statusCode": 200, "headers": CORS_HEADERS, "body": json.dumps(data)}

def error(status, message, code):
    return {"statusCode": status, "headers": CORS_HEADERS,
            "body": json.dumps({"error": message, "code": code})}

The Lambda response headers use * for the origin because the response helper doesn't know the CloudFront domain. The actual origin restriction happens at the API Gateway layer, where allowedOrigins is scoped to the CloudFront domain only. The Lambda-level * is fine here because the API uses rate limiting and daily caps for protection, not auth tokens.

The lesson I keep re-learning: always test error paths from the actual frontend, not just curl. curl doesn't care about CORS.
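A small regression test makes that lesson enforceable in CI. The sketch below restates the `error()` helper from the snippet above and asserts that the failure path carries CORS headers; the test name and where it would live are my assumptions.

```python
import json

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Headers": "Content-Type",
    "Content-Type": "application/json",
}

def error(status, message, code):
    return {"statusCode": status, "headers": CORS_HEADERS,
            "body": json.dumps({"error": message, "code": code})}

def test_error_responses_carry_cors():
    # 400 and 429 are the statuses the browser was silently blocking.
    for status in (400, 429):
        resp = error(status, "bad request", "E_BAD")
        assert resp["headers"]["Access-Control-Allow-Origin"] == "*"
        assert resp["statusCode"] == status

test_error_responses_carry_cors()
print("error paths include CORS headers")
```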

The DynamoDB seed step

After first deploy, python scripts/seed_dynamodb.py needs to run to write the tools_enabled: true config row. Without it, the budget shutoff Lambda (Layer 5 from Part 1) has no row to write to — the safety net isn't connected.

"""Run once after first deploy."""
import boto3

dynamodb = boto3.resource("dynamodb", region_name="ap-southeast-1")
table = dynamodb.Table("dr-toolkit-usage")

table.put_item(Item={
    "pk": "config",
    "sk": "global",
    "tools_enabled": True,
    "disabled_reason": None,
})
print("Config seeded — tools_enabled: True")

This could probably be handled by a custom resource in CloudFormation, but for a project this size, a one-off script after deploy is simpler.

What could be improved

Streaming responses. Right now users wait 2-5 seconds for the full response. Bedrock supports invoke_model_with_response_stream — output could appear word-by-word. The single biggest UX improvement available.

Better observability. The toolkit has CloudWatch logs but no structured metrics. A dashboard showing calls per tool, error rates, and token usage would be a solid addition.

Input validation. The Lambdas accept whatever the frontend sends with no schema validation. Quick fix that would eliminate a class of unexpected errors.
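Of the three, streaming is the most self-contained change. A hedged sketch of the consuming side: real events come from bedrock-runtime's invoke_model_with_response_stream, whose chunks are JSON-encoded bytes, but the exact payload shape varies by model family, so the `outputText` key used here is an assumption, and the fake event list exists only so the sketch runs offline.

```python
import json

def stream_text(events):
    """Yield text fragments from a Bedrock response stream.

    `events` mimics the iterable returned by
    invoke_model_with_response_stream(...)["body"]: each event wraps
    JSON-encoded bytes. The "outputText" key is a simplified, assumed
    payload shape; check your model family's actual chunk format.
    """
    for event in events:
        chunk = event.get("chunk")
        if not chunk:
            continue
        payload = json.loads(chunk["bytes"].decode("utf-8"))
        text = payload.get("outputText", "")
        if text:
            yield text

# Simulated stream so the sketch runs without AWS credentials.
fake_events = [
    {"chunk": {"bytes": json.dumps({"outputText": "Warm "}).encode()}},
    {"chunk": {"bytes": json.dumps({"outputText": "standby"}).encode()}},
]
print("".join(stream_text(fake_events)))  # prints "Warm standby"
```

On the frontend, each yielded fragment would be appended to the page as it arrives instead of waiting for the full 2-5 second response.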

Deploy it yourself

Here's how to get the toolkit running on your own AWS account.

Prerequisites

  • AWS CLI configured (aws sts get-caller-identity works)
  • Node.js ≥ 24 (for Serverless Framework and Next.js)
  • Python 3.14 (update runtime in serverless.yml if using a different version)
  • Bedrock model access enabled for the models you want to use:
    • Current defaults: amazon.nova-pro-v1:0 and amazon.nova-lite-v1:0
    • Also works with Claude, Nova Premier, or any model in the Bedrock Model Catalog
    • Check models.config.json for the exact model IDs your deployment uses

Deploy steps

# 1. Clone the repo
git clone https://github.com/romarcablao/dr-toolkit-on-aws.git
cd dr-toolkit-on-aws

# 2. Update `models.config.json` and deploy everything (backend + frontend + throttle + cache invalidation)
./scripts/deploy.sh

# 3. Seed DynamoDB (first deploy only)
python scripts/seed_dynamodb.py

# 4. Smoke test all 6 endpoints
python scripts/test_tools.py <API_URL>

The deploy script handles: npx serverless deploy, API Gateway throttle configuration, generating the frontend config from models.config.json, building the Next.js static export, syncing to S3, and invalidating CloudFront cache.

Partial deploys are also supported:

./scripts/deploy.sh --skip-backend    # frontend only
./scripts/deploy.sh --skip-frontend   # backend only

After deploy

Update CORS in serverless.yml with your CloudFront domain:

httpApi:
  cors:
    allowedOrigins:
      - 'https://your-cloudfront-domain.cloudfront.net'

Set up the budget alert: AWS Console → Billing → Budgets → Create budget → $10/month → SNS action at 100% pointing to dr-toolkit-budget-alert.

Emergency controls

# Disable all tools immediately
aws dynamodb put-item \
  --table-name dr-toolkit-usage \
  --region ap-southeast-1 \
  --item '{"pk":{"S":"config"},"sk":{"S":"global"},"tools_enabled":{"BOOL":false}}'

# Re-enable
aws dynamodb put-item \
  --table-name dr-toolkit-usage \
  --region ap-southeast-1 \
  --item '{"pk":{"S":"config"},"sk":{"S":"global"},"tools_enabled":{"BOOL":true}}'
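For context on what these commands flip: each handler presumably checks the config row before doing any work. A sketch of that check, with a stub standing in for boto3's `dynamodb.Table("dr-toolkit-usage")`; the function name and the default-to-enabled behavior when the row is missing are my assumptions, not code quoted from the repo.

```python
def tools_enabled(table):
    """Return False when the kill-switch row disables all tools.

    `table` mimics boto3's dynamodb.Table resource: get_item() returns
    {"Item": {...}} when the row exists, or {} when it does not.
    A missing row is treated as enabled here (assumed behavior), which
    matches the seed script being the thing that *connects* the net.
    """
    resp = table.get_item(Key={"pk": "config", "sk": "global"})
    item = resp.get("Item")
    if item is None:
        return True
    return bool(item.get("tools_enabled", True))

# Offline stand-in for the DynamoDB table.
class FakeTable:
    def __init__(self, item):
        self._item = item
    def get_item(self, Key):
        return {"Item": self._item} if self._item else {}

assert tools_enabled(FakeTable({"tools_enabled": True}))
assert not tools_enabled(FakeTable({"tools_enabled": False}))
print("flag check works")
```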

Adding your own tools

  1. Lambda handler — copy any handler in functions/, change TOOL_NAME and the system prompt
  2. Config — add the tool to models.config.json
  3. Route — add a function block in serverless.yml with an httpApi event
  4. Frontend — create a page under frontend/src/app/tools/your-tool/page.tsx using the useToolSubmit hook
  5. Homepage — add a card to the tools array
  6. Deploy: ./scripts/deploy.sh
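As a starting point for step 1, here is a hedged skeleton of what such a handler might look like. The `ok`/`error` helpers follow the response-helper pattern shown earlier; `TOOL_NAME`, the event shape, and the stubbed model call are illustrative assumptions, not code copied from functions/.

```python
import json

TOOL_NAME = "compliance-checker"  # hypothetical 7th tool
SYSTEM_PROMPT = "You review AWS configs and flag policy violations."

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Headers": "Content-Type",
    "Content-Type": "application/json",
}

def ok(data):
    return {"statusCode": 200, "headers": CORS_HEADERS, "body": json.dumps(data)}

def error(status, message, code):
    return {"statusCode": status, "headers": CORS_HEADERS,
            "body": json.dumps({"error": message, "code": code})}

def call_model(prompt):
    # Placeholder: the real handler would send SYSTEM_PROMPT + prompt
    # to Bedrock via invoke_model here.
    return f"[{TOOL_NAME}] reviewed {len(prompt)} chars of input"

def handler(event, context):
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return error(400, "body must be JSON", "E_BAD_JSON")
    user_input = body.get("input", "")
    if not user_input:
        return error(400, "missing 'input'", "E_MISSING_INPUT")
    return ok({"tool": TOOL_NAME, "result": call_model(user_input)})

resp = handler({"body": json.dumps({"input": "Resources: {}"})}, None)
print(resp["statusCode"])  # prints 200
```

Swap the system prompt and tool name, wire up the real Bedrock call, and steps 2-6 are config and routing.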

What's next — your turn

The architecture is in Part 1. The prompts are in Part 2. The deploy steps are above. Here's the challenge:

Deploy this toolkit to your own AWS account.

Fork the repo, run ./scripts/deploy.sh, and get it running. Don't forget to set up the budget alert. It takes about 10 minutes, and the guardrails keep costs under $10/month.

Once it's running, try these:

  • Paste one of your own CloudFormation templates into the DR Reviewer. See what gaps it catches.
  • Run the DR Strategy Advisor with your actual infrastructure parameters. Compare the recommendation to what's in place today.
  • Throw real incident notes into the Post-Mortem Writer. See if the structured output is something you'd actually use.

And if you want to go further:

Add a 7th tool with Kiro. This is how the original six were built. Open the project in Kiro, describe the tool you want in natural language ("a compliance checker that takes an AWS config and flags policy violations"), and let Kiro generate a spec with requirements and an implementation plan before writing any code. Kiro's spec-driven workflow means you get the handler, the system prompt, and the config entry scaffolded from a structured plan rather than freehand prompting. Security audit, cost optimization, compliance check — same architecture, different prompts. The handler pattern from Part 2 means the code side is mostly copy-paste; the interesting part is writing the spec and tuning the system prompt.

Improve what's here. Streaming responses, input validation, a CloudWatch dashboard.

Wrapping up

This series covered the full lifecycle of a serverless AI project on AWS: architecture design (Part 1), prompt engineering (Part 2), and the real-world lessons and deployment (Part 3).

BuildWithAI Series Banner

The DR strategies the toolkit recommends — backup & restore, pilot light, warm standby, multi-site active/active — come straight from the AWS Disaster Recovery whitepaper. That whitepaper is excellent, but there's a gap between understanding the four strategies and having an actual runbook for your infrastructure. These tools try to close that gap.

Try it / Fork it:

Live Demo: https://dr-toolkit.thecloudspark.com

DR Toolkit

AI-powered disaster recovery planning tool for AWS builders. Plan, document, and audit your DR posture with Amazon Bedrock. Resilience planning, accelerated by generative AI.




Source Code: github.com/romarcablao/dr-toolkit-on-aws

DR Toolkit on AWS

Kiro Amazon Bedrock AWS Lambda Amazon DynamoDB Amazon S3 Amazon CloudFront Next.js Tailwind CSS

Tools

| # | Tool | Endpoint | Model | Daily Limit |
|---|------|----------|-------|-------------|
| 1 | Runbook Generator | POST /runbook | Nova Pro | 50/day |
| 2 | RTO/RPO Estimator | POST /rto-estimator | Nova Lite | 50/day |
| 3 | DR Strategy Advisor | POST /dr-advisor | Nova Lite | 50/day |
| 4 | Post-Mortem Writer | POST /postmortem | Nova Lite | 50/day |
| 5 | DR Checklist Builder | POST /checklist | Nova Lite | 50/day |
| 6 | Template DR Reviewer | POST /dr-reviewer | Nova Pro | 30/day |

Screenshot: DR Toolkit tools overview

Architecture

  • Frontend: Next.js 16 (static export) + Tailwind CSS → S3 + CloudFront
  • Backend: AWS Lambda (Python 3.14) → API Gateway HTTP API
  • AI: Amazon Bedrock — Nova Lite (Tools 2–5), Nova Pro (Tools 1, 6)
  • Database: DynamoDB single table dr-toolkit-usage (usage counters + feature flag)
  • IaC: Serverless Framework v3 (serverless.yml)
  • Region: ap-southeast-1 (Singapore)

Project Structure

dr-toolkit/
├── serverless.yml             # Serverless Framework