How I Let an AI Refactor My Whole Codebase (Using Gemini 3.5)

Dev.to / 5/21/2026


Key Points

The author describes the difficulty of untangling a tightly coupled legacy monolith and extracting clean microservices as a solo developer facing months of manual work.
Inspired by Google I/O 2026’s agentic workflow announcements, they built an autonomous repository refactoring worker called “Flash.”
Flash reads the entire codebase, maps hidden dependencies, and produces clean pull requests that isolate functionality into independent microservices.
The project relies on Gemini 3.5 Flash’s large 1M-token context window and faster performance, avoiding the limitations of approaches that only retrieve a few files at a time.
The refactoring agent is implemented using Google’s Interactions API and a “thinking_level” setting to plan architecture before writing code.

Every solo developer knows the dread of a Friday afternoon deployment. You push a minor update to a user profile component, and somehow your entire payment processing pipeline goes offline. That is the harsh reality of living with a legacy monolith.
Over five years of rapid development, my side project turned into a massive web of tightly coupled logic. The user system relied on the billing system. The billing system relied on the notification service. Extracting these pieces into clean microservices had been on my to-do list for years. The problem is that I am just one person. I simply did not have the six months required to manually trace thousands of undocumented files and untangle the mess myself.
When I watched the Google I/O 2026 keynote, the announcements around agentic workflows caught my attention immediately. I realized I could finally automate this nightmare. I built an autonomous extraction worker and named it Flash.
Flash is designed to read my entire repository, map the hidden dependencies, and generate clean pull requests that isolate code into independent microservices.

If you missed the developer announcements, the keynote is where Google revealed the new agent platforms and context windows that make this possible.

The Latency and Memory Problem

Building a codebase refactoring tool requires two things: enough memory to hold the whole project in context, and enough speed to work through it without stalling.
Older setups required complex vector databases to retrieve code snippets. This approach always failed for refactoring because an AI cannot restructure a billing module if it can only see three files at a time. It needs the entire project context to understand how changing a variable impacts a file nested ten folders away.
Google revealed that Gemini 3.5 Flash operates with a 1 million token context window while being significantly faster than other frontier models. Most importantly for this project, it outscored heavier models on coding benchmarks. This meant I could feed the agent my whole repository at once without hitting timeouts or draining my personal cloud budget.
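Before sending a whole repository in one request, it can be worth checking that it is likely to fit. The sketch below is a ballpark estimate only. It assumes roughly four characters per token, which is a common rule of thumb and not Gemini's actual tokenizer, and the function names, the suffix list, and the reserve left for the reply are illustrative choices rather than anything from the project itself.

```python
import os

# Assumption: ~4 characters per token is a rough rule of thumb,
# not the real tokenizer. Treat the result as a ballpark figure.
CHARS_PER_TOKEN = 4
CONTEXT_BUDGET = 1_000_000  # the 1M-token window mentioned above


def estimate_repo_tokens(directory_path: str, suffixes=(".py", ".ts")) -> int:
    """Return an approximate token count for source files under directory_path."""
    total_chars = 0
    for root, _dirs, files in os.walk(directory_path):
        for name in files:
            if name.endswith(suffixes):
                path = os.path.join(root, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    total_chars += len(handle.read())
    return total_chars // CHARS_PER_TOKEN


def fits_in_context(directory_path: str, reserve: int = 100_000) -> bool:
    """Check the estimate against the budget, leaving room for the reply."""
    return estimate_repo_tokens(directory_path) <= CONTEXT_BUDGET - reserve
```

If the estimate lands close to the budget, an exact count from the provider's own token counter is the safer check.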

Step 1: Initializing the Architect

To start, I used the new Interactions API. This API handles stateful workflows natively so I do not have to write manual conversation loops. I configured the agent with the new thinking_level parameter. This forces the model to spend extra compute cycles planning the architecture before it starts writing any actual code.

from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

# Initialize the agent with high reasoning capabilities
flash_agent = client.agents.create(
    model="gemini-3.5-flash",
    name="MonolithExtractor",
    instructions=(
        "You are a principal software architect. Your goal is to analyze "
        "large monolithic codebases, identify domain boundaries, and "
        "extract tightly coupled modules into isolated microservices. "
        "Always prioritize preserving existing business logic."
    ),
    thinking_level="high"
)
print("Flash is online and ready to read code.")

Step 2: Mapping the Dependency Tree

An agent needs a way to understand the physical layout of a project. I created a standard Python function that acts as a tool for the agent. This tool scans directories and maps out how different files import each other.
By passing this tool directly to the model, Flash can build a reliable mental map of the entire monolith before deciding what to refactor.

import os

def generate_dependency_map(directory_path: str) -> dict:
    """Scans the repository and returns a map of file imports and dependencies."""
    dependency_map = {}

    for root, _dirs, files in os.walk(directory_path):
        for file in files:
            # str.endswith accepts a tuple of suffixes
            if file.endswith((".py", ".ts")):
                file_path = os.path.join(root, file)
                # In a real app, you would parse the AST here to find imports
                dependency_map[file_path] = f"Parsed dependencies for {file}"

    return {"status": "success", "dependencies": dependency_map}

# Bind the tool to the Gemini agent
flash_agent.add_tool(generate_dependency_map)
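
The tool above stores a placeholder string per file. For the Python half of a repository, the standard library's `ast` module can produce real import lists. This is a minimal sketch: `extract_imports` is a hypothetical helper, it handles Python source only (TypeScript files would need a different parser), and it raises `SyntaxError` on files that do not parse.

```python
import ast


def extract_imports(source: str) -> list[str]:
    """Return the sorted module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b as c" records "a.b"
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # "from a.b import c" records "a.b"; relative imports keep their dots
            modules.add("." * node.level + (node.module or ""))
    return sorted(modules)
```

Calling this on each file's contents and storing the result in `dependency_map[file_path]` would give the agent concrete edges to reason about instead of a description.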

Step 3: Feeding the Beast

Once the agent understands how to navigate the file structure, it needs to process the actual logic. Thanks to the massive context window of Gemini 3.5 Flash, I did not have to write a fragile chunking script. I simply passed the target directory path and let the agent read the files natively.
It uses the dependency tool first, and then it reads the code to figure out how to untangle the billing module.

# The path to my legacy backend repository
monolith_path = "/var/lib/legacy_backend_repo"

# The agent uses its token window to process the directory
response = flash_agent.chat(
    message=(
        f"Analyze the repository located at {monolith_path}. "
        "Isolate the billing module from the user profile system. "
        "Generate a separate microservice structure for billing."
    )
)

print("Refactoring Plan:")
print(response.text)
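
Whether an agent can open files from a local path depends on the runtime and on the tools it has been given. Where it cannot, one fallback is to inline the files into the message itself. The sketch below assumes a hypothetical helper named `pack_repository` and an arbitrary `=== path ===` header format; neither comes from the original project.

```python
import os


def pack_repository(directory_path: str, suffixes=(".py", ".ts")) -> str:
    """Concatenate source files into one string, each under a path header."""
    sections = []
    for root, dirs, files in os.walk(directory_path):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            if not name.endswith(suffixes):
                continue
            path = os.path.join(root, name)
            relative = os.path.relpath(path, directory_path)
            with open(path, encoding="utf-8", errors="replace") as handle:
                sections.append(f"=== {relative} ===\n{handle.read()}")
    return "\n\n".join(sections)
```

The packed string can then be appended to the instruction text, so the model sees file contents rather than just a path it may not be able to resolve.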

Step 4: Automating the Pull Requests

Identifying the tightly coupled code is only half the battle. To be truly useful, Flash needs to write the new microservice and submit it for review.
I added a GitHub integration tool. Once the agent finalizes a refactoring plan and writes the decoupled code, it automatically pushes those files to a new branch and requests a review from me.

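def create_pull_request(branch_name: str, code_changes: dict, pr_notes: str) -> str:
    """Pushes extracted microservice code to a branch and opens a GitHub PR."""
    # Placeholder: a real tool would create the branch with git here
    print(f"Creating new branch: {branch_name}")

    for filepath, content in code_changes.items():
        # Placeholder: a real tool would write `content` to `filepath` and commit it
        print(f"Writing updated logic to {filepath}...")

    # Placeholder: a real tool would open the PR through the GitHub API
    print(f"Drafting PR description: {pr_notes}")

    return f"Pull request opened successfully on branch {branch_name}."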

# Give Flash permission to commit code
flash_agent.add_tool(create_pull_request)
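
The tool above only logs what it would do. As a sketch of the first real step, the function below writes the agent's `code_changes` into a working tree and refuses any path that would land outside it, which matters when file paths come from model output. The name `write_changes` and the choice to raise `ValueError` are illustrative assumptions; committing, pushing, and opening the PR are deliberately left out.

```python
from pathlib import Path


def write_changes(repo_root: str, code_changes: dict[str, str]) -> list[str]:
    """Write each file under repo_root and return the relative paths written."""
    root = Path(repo_root).resolve()
    written = []
    for relative_path, content in code_changes.items():
        target = (root / relative_path).resolve()
        # Reject absolute paths and "../" tricks that escape the repository
        if not target.is_relative_to(root):
            raise ValueError(f"Refusing to write outside the repo: {relative_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(str(target.relative_to(root)))
    return written
```

This relies on `Path.is_relative_to`, which is available from Python 3.9.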

Step 5: Taking it to Production

Running a local script is great for a proof of concept. However, untangling an entire repository is an ongoing process.
Google launched Antigravity 2.0 as an orchestration platform for exactly this kind of agentic workload. Instead of managing my own servers or cron jobs, I wrapped my local agent into an Antigravity worker. Now, Flash lives in the cloud. It runs automatically in the background whenever I tag a GitHub issue with the label extract_service.

from google.antigravity import Orchestrator, AgentWorker

def deploy_flash_worker():
    # Initialize the Antigravity orchestration environment
    orchestrator = Orchestrator(project_name="MonolithToMicro")

    # Assign the agent to a cloud worker with a GitHub event trigger
    worker = AgentWorker(
        agent=flash_agent,
        github_repo="my-username/core-monolith",
        trigger="on_issue_labeled:extract_service"
    )

    orchestrator.add_worker(worker)
    orchestrator.deploy()

deploy_flash_worker()
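
The trigger string above is specific to the orchestration layer. For readers wiring this up by hand, the same condition can be checked against GitHub's webhook payload for `issues` events, where adding a label arrives with `action` set to `labeled` and the label itself under `label.name`. A minimal sketch: `should_run_extraction` is a hypothetical helper, and it reads only those two fields.

```python
TRIGGER_LABEL = "extract_service"


def should_run_extraction(event_name: str, payload: dict) -> bool:
    """Return True when an issue was just tagged with the trigger label."""
    # event_name is the value of the X-GitHub-Event request header
    if event_name != "issues" or payload.get("action") != "labeled":
        return False
    label = payload.get("label") or {}
    return label.get("name") == TRIGGER_LABEL
```

A webhook handler would call this first and only start the agent when it returns True, so unrelated issue activity never kicks off a refactoring run.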

The Results

The first time Flash opened a pull request, I just stared at my monitor in disbelief. The agent had successfully separated the Stripe API webhooks from my user authentication logic. It created a fresh directory, updated the internal import paths, and wrote a clean setup file for the new service.
It was not perfect out of the box. I still had to tweak a few environment variables during my code review. But it completed three weeks of tedious manual tracing in about forty-five seconds.
By leveraging the speed and context size of Gemini 3.5 Flash, I turned a massive technical debt problem into an automated background process. As a solo developer, time is my most valuable asset. Thanks to Flash, I can finally stop fearing my own codebase and get back to building new features.
