AI Navigate

I Built an AI Legal OS with 60 Specialized Agents and Real-time Statute Verification

Dev.to / 3/12/2026

Key Points

  • Lawmadi OS is an AI legal operating system with 60 domain-specialized agents that verify every answer against live Korean government databases to prevent legal hallucinations.
  • It uses a 3-layer NLU routing approach to achieve low latency, high accuracy (264/264 test cases) and cost efficiency.
  • Each of the 60 agents specializes in a distinct area of Korean law, enabling precise, statute-aware responses across domains like labor, leases, divorce, traffic, and criminal law.
  • The system features a 4-stage verification pipeline that extracts statute citations, queries law.go.kr via API, verifies existence and content accuracy, and assigns a 0-100 verification score to decide whether to reject a response.
  • It cross-references 10 government data sources (statutes, decrees, rules, court precedents, administrative rules, etc.) and uses a fail-closed design to prevent releasing incorrect legal information.

When people in Korea face legal issues, they have three bad options: expensive lawyers ($75+ per session), unreliable internet searches, or AI chatbots that hallucinate laws that don't exist.

I built Lawmadi OS to fix this — an AI legal operating system with 60 domain-specialized agents that verify every answer against live government databases.

Live: lawmadi.com

The Problem with Legal AI

Ask ChatGPT about Korean labor law and it will confidently cite "Article 27 of the Labor Standards Act", even though that article might not say what the chatbot claims, or might not exist at all. In the legal domain, hallucination isn't just annoying; it's dangerous, because people make life-changing decisions based on legal information.

How Lawmadi OS Works

3-Layer NLU Routing

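Instead of sending every query through an expensive LLM classification step, we use cascading routing across three layers, so the expensive step runs only for queries that the cheaper layers cannot resolve.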

This gives us low latency for most queries, high accuracy (264/264 test cases passing), and cost efficiency.
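As a rough illustration of cascading routing (not the production router), the sketch below assumes three hypothetical layers: exact keyword rules, a cheap token-overlap scorer, and an LLM classifier used only as a last resort. The rule tables, the `min_hits` cutoff, and the `llm_classify` callable are all assumptions made for the example; only the agent IDs come from this post.

```python
from typing import Callable, Optional

# Hypothetical keyword rules; agent IDs are the ones listed in this post.
KEYWORD_RULES = {
    "dismissal": "L09",
    "unpaid wages": "L09",
    "deposit": "L08",
    "divorce": "L03",
}

# Hypothetical per-agent vocabularies for the cheap similarity layer.
AGENT_VOCAB = {
    "L09": {"employer", "fired", "salary", "overtime", "contract"},
    "L08": {"landlord", "tenant", "rent", "lease", "eviction"},
    "L10": {"car", "accident", "collision", "insurance", "driver"},
}


def route_by_keyword(query: str) -> Optional[str]:
    """Layer 1: return an agent if a known phrase appears in the query."""
    text = query.lower()
    for phrase, agent in KEYWORD_RULES.items():
        if phrase in text:
            return agent
    return None


def route_by_overlap(query: str, min_hits: int = 2) -> Optional[str]:
    """Layer 2: pick the agent whose vocabulary shares the most tokens."""
    tokens = set(query.lower().split())
    best_agent, best_hits = None, 0
    for agent, vocab in AGENT_VOCAB.items():
        hits = len(tokens & vocab)
        if hits > best_hits:
            best_agent, best_hits = agent, hits
    return best_agent if best_hits >= min_hits else None


def route(query: str, llm_classify: Callable[[str], str]) -> tuple[str, int]:
    """Return (agent_id, layer_used); the LLM runs only if layers 1-2 miss."""
    agent = route_by_keyword(query)
    if agent is not None:
        return agent, 1
    agent = route_by_overlap(query)
    if agent is not None:
        return agent, 2
    return llm_classify(query), 3
```

Returning the layer number alongside the agent makes it easy to measure how many queries never reach the LLM step, which is where the latency and cost savings of a cascade would come from.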

60 Specialized Agents

Each of the 60 agents specializes in a specific area of Korean law:

  • L09 담우 — Labor Law (unfair dismissal, unpaid wages)
  • L08 온유 — Lease/Rent Law (전세 deposits, tenant rights)
  • L03 담슬 — Divorce & Family Law
  • L10 결휘 — Traffic Accidents
  • L01 휘율 — Criminal Law
  • And 55 more covering tax, IP, immigration, inheritance, medical, military, environment, data privacy, startups, etc.

Why 60 agents instead of one generalist? Specialization matters: each agent has domain-tuned prompts, knowledge of the relevant statutes, and optimized response patterns, much like a law firm staffed with 60 specialists.
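One way to picture the agent set is as a registry keyed by agent ID. The sketch below is illustrative only: the `Agent` fields, the `PROMPT_TEMPLATE` wording, and the `make_agent` helper are assumptions, since the real prompts are not published here. Only the IDs, names, and domains come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    agent_id: str       # e.g. "L09", as listed above
    name: str
    domain: str
    system_prompt: str  # hypothetical wording, generated from a template


# Assumed template; the real prompts are not published in this post.
PROMPT_TEMPLATE = (
    "You are {name}, an assistant specializing in Korean {domain}. "
    "Cite every statute by law name and article number."
)


def make_agent(agent_id: str, name: str, domain: str) -> Agent:
    """Build an agent whose system prompt is filled in from the template."""
    prompt = PROMPT_TEMPLATE.format(name=name, domain=domain)
    return Agent(agent_id, name, domain, prompt)


REGISTRY = {
    agent.agent_id: agent
    for agent in [
        make_agent("L01", "휘율", "criminal law"),
        make_agent("L03", "담슬", "divorce and family law"),
        make_agent("L08", "온유", "lease and rent law"),
        make_agent("L09", "담우", "labor law"),
        make_agent("L10", "결휘", "traffic accident law"),
    ]
}
```

A router can then resolve an ID such as `"L09"` to its prompt with `REGISTRY["L09"].system_prompt`.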

4-Stage Verification Pipeline

This pipeline is the core of the architecture.

Stage 4 is what makes Lawmadi OS different. After Gemini generates a response, we:

  1. Extract all statute citations from the response
  2. Query Korea's official legislative database (Ministry of Government Legislation, 법제처, law.go.kr) via its DRF API
  3. Verify — Does the law exist? Does the article number exist? Is the content accurate?
  4. Score — Generate a 0-100 verification score
  5. Decide — If score is below threshold, reject the entire response
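The steps above can be sketched roughly as follows. This is a simplified illustration under several assumptions: the regex only covers English-form citations such as "Article 23 of the Labor Standards Act", the `exists` callable is a hypothetical stand-in for the law.go.kr lookup (no real API call is made), the score is taken to be the share of confirmed citations, and the threshold of 80 is made up for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Matches citations of the form "Article 23 of the Labor Standards Act".
# Assumed pattern for illustration; real responses also need Korean forms.
CITATION_RE = re.compile(r"Article (\d+) of the ([A-Z][A-Za-z ]+? Act)")


@dataclass(frozen=True)
class Citation:
    law: str
    article: int


def extract_citations(response: str) -> list[Citation]:
    """Step 1: pull every (law, article) pair out of the generated text."""
    return [Citation(law=m.group(2), article=int(m.group(1)))
            for m in CITATION_RE.finditer(response)]


def verification_score(citations: list[Citation],
                       exists: Callable[[Citation], bool]) -> int:
    """Steps 2-4: percentage (0-100) of citations confirmed by the lookup."""
    if not citations:
        return 0  # nothing verifiable; treated as unverified in this sketch
    confirmed = sum(1 for c in citations if exists(c))
    return round(100 * confirmed / len(citations))


def accept(response: str, exists: Callable[[Citation], bool],
           threshold: int = 80) -> tuple[bool, int]:
    """Step 5: reject the whole response if the score is below threshold."""
    score = verification_score(extract_citations(response), exists)
    return score >= threshold, score
```

Passing the lookup in as a callable keeps the scoring logic testable against a small fake set of known citations, without touching the live database.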

We cross-reference against 10 government data sources:

  • Statutes (법령)
  • Enforcement Decrees (시행령)
  • Enforcement Rules (시행규칙)
  • Court Precedents (판례)
  • Administrative Rules (행정규칙)
  • And 5 more

Fail-Closed Design

If the verification API is down, the system doesn't fall back to unverified responses. Instead:

  • Circuit breaker trips after consecutive failures
  • System enters fail-closed mode
  • All responses are held until verification is available
  • We'd rather give no answer than an unverified one
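A minimal sketch of the fail-closed idea, assuming a simple consecutive-failure counter: the class name, the `max_failures` default, the `VerifierUnavailable` exception, and the `probe` health check are all hypothetical. The point it illustrates is that no code path returns a response without a successful verification call.

```python
from typing import Callable, Optional


class VerifierUnavailable(Exception):
    """Raised when verification cannot be performed (hypothetical)."""


class FailClosedBreaker:
    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0  # consecutive verification failures so far

    @property
    def open(self) -> bool:
        return self.failures >= self.max_failures

    def release(self, response: str,
                verify: Callable[[str], bool]) -> Optional[str]:
        """Return the response only if verification ran and passed."""
        if self.open:
            return None  # fail closed: hold the response, never skip verification
        try:
            ok = verify(response)
        except VerifierUnavailable:
            self.failures += 1
            return None
        self.failures = 0  # a successful call resets the consecutive count
        return response if ok else None

    def probe(self, health_check: Callable[[], bool]) -> None:
        """Close the breaker again once a health check succeeds."""
        if health_check():
            self.failures = 0
```

In this sketch `None` means "held": the caller would queue the response or tell the user that verification is temporarily unavailable.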

The 5-Stage Empathy Framework

Legal issues are stressful. Every response follows this structure:

  1. Emotional acknowledgment — "This situation must be frustrating..."
  2. Situation diagnosis — Clear analysis of the legal issue
  3. Action roadmap — Specific steps with deadlines
  4. Safety net — Legal aid resources, hotlines, government services
  5. Supportive closing — Encouragement and next steps

Results After 1 Week

Metric                     Value
Unique Visitors            114
Queries Processed          481
Success Rate               99.6%
Avg Verification Score     84.7/100
Korean Citation Accuracy   82.5%
English Citation Accuracy  25.6% (improving)
Tests Passing              282/282
Avg Response Time          ~40s

Most popular domains: Labor law (90 queries), Housing/Lease (83), Divorce (50), Traffic accidents (48)

Tech Stack

Component     Technology
Backend       FastAPI 0.128.0 + Python 3.10+
LLM           Google Gemini 2.5 Flash
RAG           Vertex AI Search (14,601 docs)
Verification  법제처 DRF API (10 SSOT sources)
Database      Cloud SQL PostgreSQL 17
Hosting       GCP Cloud Run + Firebase
Billing       Paddle (credit-based)
CI/CD         GitHub Actions (5-stage pipeline)
Auth          JWT RBAC + Email OTP
Anti-abuse    IP + Canvas Fingerprint + Device Token

Pricing

  • Free: 2 queries/day (no account needed)
  • Starter: 20 queries — .50
  • Standard: 100 queries — .99
  • Pro: 300 queries — .99

Credit-based, no subscription. Powered by Paddle.

Challenges & Next Steps

  1. Latency: ~40s avg is too slow. Gemini generation (~30s) is the bottleneck. Exploring parallel RAG + prefetch.
  2. English citations: 25.6% accuracy vs 82.5% for Korean. English translations of Korean law names are not standardized, so matching them is inconsistent.
  3. Scale: 60 system prompts to maintain. Considering automated prompt generation.
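For the latency item, a rough sketch of what "parallel RAG + prefetch" could mean with `asyncio`: the two coroutines below are hypothetical stand-ins (the sleeps simulate network calls), and running them with `asyncio.gather` makes the wait roughly the longer of the two instead of their sum. This only shortens the pre-generation phase; it does not touch the ~30s generation step itself.

```python
import asyncio
import time


async def retrieve_documents(query: str) -> list[str]:
    await asyncio.sleep(0.2)  # stand-in for a RAG search call
    return [f"doc for {query}"]


async def prefetch_statutes(query: str) -> list[str]:
    await asyncio.sleep(0.2)  # stand-in for a statute lookup
    return [f"statute for {query}"]


async def gather_context(query: str) -> tuple[list[str], list[str]]:
    # Both I/O-bound calls run concurrently, so the total wait is about
    # max(0.2, 0.2) seconds rather than 0.2 + 0.2.
    docs, statutes = await asyncio.gather(
        retrieve_documents(query), prefetch_statutes(query)
    )
    return docs, statutes
```

Calling `asyncio.run(gather_context("unpaid wages"))` returns both result lists in the order the coroutines were passed to `gather`.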

Try It

I'd love to hear your thoughts, especially on:

  • Multi-agent specialization vs. single generalist approaches
  • Fail-closed verification in AI systems
  • Ideas for reducing response latency

Built by Jainam Choe — choepeter@outlook.kr