How I Review PRs with AI — Without Losing My Own Judgment

Dev.to / 4/26/2026

💬 Opinion · Developer Stack & Infrastructure · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • The author describes how AI-assisted code review can speed up handling the larger, more complex PRs that agentic coding produces, while keeping architectural judgment with the reviewer.
  • They recommend a “context isolation” rule of using one dedicated AI session per PR, so the reviewer retains the mental model across days until merge.
  • The approach uses a repeatable, tool-agnostic set of prompts (available on GitHub) to split review work into distinct stages rather than running a single generic AI pass.
  • Their workflow begins with a human-led understanding phase, then proceeds through additional phases that delegate heavy scanning to AI while preserving the reviewer’s responsibility for final decisions.

Originally published on Medium:
https://medium.com/@rajkundalia/how-i-review-prs-with-ai-without-losing-my-own-judgment-f930ad30dc60

Over the last few months, my code review queue has changed completely. With agentic coding, PRs are larger, faster, and harder to reason about.

I needed a system that was faster, but I absolutely did not want to just hand things off to an AI and call it a review.

Built-in tools exist. Claude Code has /review or /deep-review, and GitHub Copilot's PR review is decent out of the box. If you just want an AI pass, they work fine. But I am not optimizing for just an AI pass; I am optimizing for understanding and architectural signal.

Here is a repeatable framework I use to let AI handle the heavy scanning, while I keep the heavy thinking and judgment firmly in my own hands.

(Note: All the prompts referenced below are open-source in my GitHub repo: 👉 https://github.com/rajkundalia/ai-code-review-prompts. They are tool-agnostic — paste them into Claude, ChatGPT, Cursor, or whatever you prefer.)

The Golden Rule: Context Isolation

Before we get into the phases, there is one non-negotiable rule that makes this entire system work: One AI session per PR.

If you mix your own daily work, multiple PR reviews, and random questions into a single AI session, you lose context. PR reviews are context-heavy. When a colleague replies to your comment four days later, a dedicated, preserved AI session lets you instantly reload your mental model and remember why you left that comment in the first place.

Keep the thread alive from the start of the review through the merge.
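
If your tool does not persist sessions for you, the idea is easy to approximate. A minimal sketch in Python, assuming a generic chat-completion client (the chat() stub and the file layout are illustrative placeholders, not any specific tool's API):

```python
# One session per PR: a separate, persistent message history keyed by
# PR number, so context survives until the merge -- even days later.
import json
from pathlib import Path

SESSIONS = Path("pr-review-sessions")
SESSIONS.mkdir(exist_ok=True)

def chat(messages: list[dict]) -> str:
    # Plug in your LLM client of choice here (Claude, ChatGPT, ...).
    raise NotImplementedError

def ask(pr_number: int, question: str) -> str:
    path = SESSIONS / f"pr-{pr_number}.json"
    messages = json.loads(path.read_text()) if path.exists() else []
    messages.append({"role": "user", "content": question})
    reply = chat(messages)
    messages.append({"role": "assistant", "content": reply})
    path.write_text(json.dumps(messages, indent=2))
    return reply
```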

The 4-Phase PR Review Workflow

When I load my initial prompt, it gives me a starting point: a high-level summary, the files touched, and the core intent of the PR. From there, I move through four distinct phases. Do not skip ahead.
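
For a flavor of what that initial prompt asks for, here is a simplified sketch (not the exact prompt from the repo):

```python
# Simplified starting prompt -- the real, maintained version is in the repo.
INITIAL_PROMPT = """\
I am reviewing a pull request. Given the diff below:
1. Summarize the change at a high level.
2. List every file touched, with one line on why it changed.
3. State the core intent of the PR in a single sentence.
Describe only -- do not judge the code yet.

{diff}
"""
```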

Phase 1: Build Understanding (Human First)

What happens next is entirely mine. I go file by file, line by line, and ask the AI questions until I have built my own understanding of the flow:

  • What is this doing?
  • Where is this data model used further downstream?
  • What breaks if this assumption changes?

This is deliberately manual. Anything I still do not understand after interrogating the AI, I flag for a human comment.
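
In practice those questions get concrete fast. A few examples of what I might actually type (the function and model names here are hypothetical):

```python
# Hypothetical, file-specific versions of the questions above.
PHASE_1_QUESTIONS = [
    "Walk me through what process_refund() in payments/service.py does, step by step.",
    "Where is the RefundRequest model read or written outside this PR?",
    "This assumes refunds are idempotent. What breaks if they are not?",
]
```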

If you skip this phase, you're not reviewing the code — you're reviewing the AI's opinion of the code.

Phase 2: AI First Pass (Filter the Noise)

This is where the AI does its first real pass, flagging standard issues and inconsistencies. It is intentionally a surface pass.

The reason this is a separate phase from the deep review is simple: I want the obvious stuff caught and out of the way early. It gives me a chance to dismiss irrelevant suggestions immediately, ensuring the next phase isn't cluttered with noise.

👉 Think of this as signal extraction, not decision-making.
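
A surface-pass prompt in this spirit might look like the following sketch (not the repo's exact wording):

```python
# Sketch of a surface-pass prompt; the maintained version lives in the repo.
FIRST_PASS_PROMPT = """\
Do a surface pass over this PR. Flag only:
- obvious bugs, typos, and dead code
- naming or style inconsistencies with the surrounding code
- clearly missing null checks or error handling
Do NOT comment on architecture or design yet.
Report one finding per line: file, line, issue.
"""
```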

Phase 3: The Deep Review (Pressure Testing)

This is the heaviest phase, driven by a few specific forcing functions:

The "Chief Programmer" & "Chief Architect" Persona
Giving the AI a specific role produces sharper, more critical output than a generic "review this code." You can adjust the role to fit your domain, e.g., chief AI engineer if you are reviewing prompt code.
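
Something as simple as this framing changes the tone of the output. A sketch:

```python
# Persona framing -- adjust the role to your domain.
DEEP_REVIEW_PROMPT = """\
Act as the chief programmer and chief architect of this codebase.
Review the PR for design flaws, hidden coupling, scaling risks,
and anything you would veto. Be critical; do not soften findings.
"""
```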

Real Coverage vs. Theater
AI agents generate a massive number of tests. Left unchecked, they will write tests for data models with no logic, or tests that merely verify that Python works. I explicitly prompt the AI to look for meaningful behavior validation so we catch the noise upfront. That beats constantly asking the AI to remove redundant tests after the fact.

Tests should prove behavior, not existence.
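
Here is the difference in miniature (Order and apply_discount are invented for illustration):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Order:
    id: int
    total: float

def apply_discount(order: Order) -> Order:
    # Illustrative rule: 10% bulk discount at or above a 500 threshold.
    if order.total >= 500:
        return replace(order, total=order.total * 0.9)
    return order

# Test theater: this only proves the dataclass stores what you gave it.
def test_order_fields():
    order = Order(id=1, total=100)
    assert order.id == 1
    assert order.total == 100

# Behavior validation: this proves the discount rule actually works,
# including the boundary it could silently get wrong.
def test_bulk_discount_threshold():
    assert apply_discount(Order(id=1, total=1000)).total == 900
    assert apply_discount(Order(id=2, total=499)).total == 499
```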

Playing Devil’s Advocate
I force the LLM to question its own assumptions. What could go wrong? Where would this fail in production three months from now?

This surfaces edge cases that standard reviews easily miss.
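
A devil's-advocate prompt along these lines works well (sketched here, not verbatim from the repo):

```python
# Devil's-advocate framing: make the model argue against its own review.
DEVILS_ADVOCATE_PROMPT = """\
Now argue against your own review. For each change in this PR:
- Which assumption could be wrong, and how would we know?
- How does it behave under load, partial failure, or bad input?
- What breaks in production three months from now, after the
  author has moved on?
"""
```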

Phase 4: The Verdict

Finally, I combine my Phase 1 understanding with the AI's deep review insights. The AI helps me classify the findings into:

  • Must-fix blockers
  • Good-to-have stylistic suggestions
  • Noise to be discarded
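
The classification step is itself a prompt. Roughly:

```python
# Verdict prompt sketch: force every finding into one of three buckets.
VERDICT_PROMPT = """\
Combine everything we found in this session. Classify each finding:
[BLOCKER] must fix before merge
[SUGGEST] stylistic / good-to-have
[NOISE]   discard, with one line on why
I make the final call on every item; never merge the categories.
"""
```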

The Author's Duty: Self-Review

Before your code ever reaches another human, it is your responsibility to review it.

I converted my PR review framework into a self-review prompt. I run through the exact same phases on my own code. The output here is highly surgical: it tells me the file, the line, what is wrong, and what to do instead.
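
Pinning down that output shape is just another instruction in the prompt. For example:

```python
# Output-format constraint for the self-review prompt (illustrative).
OUTPUT_FORMAT = """\
Report every finding in exactly this shape:
<file>:<line> -- <what is wrong> -- <what to do instead>
Drop anything you cannot tie to a specific file and line.
"""
```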

The goal is simple:

The comments you eventually get from your peers should be about high-level design decisions — not trivial things you could have caught yourself.

You get serious brownie points for consistently raising high-quality, pre-vetted PRs.

Scaling the Process

Not every PR needs all four phases.

  • A 10-line config change → quick pass
  • A 1,000-line refactor → full deep review

Match the depth of review to the risk and complexity.

Over-reviewing small changes is wasteful.
Under-reviewing large ones is dangerous.
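
If you want a rule of thumb you can actually apply, a rough heuristic looks like this (the thresholds and the touches_core signal are made up, so tune them to your codebase):

```python
# Rough heuristic for matching review depth to risk and complexity.
def review_phases(lines_changed: int, touches_core: bool) -> list[str]:
    if lines_changed <= 20 and not touches_core:
        return ["quick pass"]                    # e.g. a 10-line config change
    if lines_changed <= 300 and not touches_core:
        return ["phase 1", "phase 2", "phase 4"]
    # Large or core-touching changes get the full treatment,
    # e.g. a 1,000-line refactor.
    return ["phase 1", "phase 2", "phase 3", "phase 4"]
```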

Final Thoughts

I am not offloading my thinking to an AI. I am using it to explore faster, validate assumptions, and stress-test decisions. The thinking is still mine.

The leverage is new.
The responsibility isn’t.

These tools are incredibly powerful — but you still need to hold the leash.

I’ve open-sourced the prompts and guidelines I use:
👉 https://github.com/rajkundalia/ai-code-review-prompts

If you have better ideas, improvements, or ways to reduce noise — I’d genuinely like to see them.