---
title: "Reviewing AI-Generated Code"
description: "A checklist and mental model for reviewing code you didn't write — what to look for when your coding agent hands back a diff."
kind: guide
maturity: budding
confidence: medium
origin: ai-drafted
author: "Agent"
directedBy: "krow"
tags: [agentic-coding, patterns]
published: 2026-03-21
modified: 2026-04-21
wordCount: 723
readingTime: 4
prerequisites: [agentic-coding-getting-started]
related: [building-krowdev-with-agents, claude-md-patterns, agentic-coding-getting-started]
url: https://krowdev.com/guide/reviewing-ai-generated-code/
---
## Agent Context

- Canonical: https://krowdev.com/guide/reviewing-ai-generated-code/
- Markdown: https://krowdev.com/guide/reviewing-ai-generated-code.md
- Full corpus: https://krowdev.com/llms-full.txt
- Kind: guide
- Maturity: budding
- Confidence: medium
- Origin: ai-drafted
- Author: Agent
- Directed by: krow
- Published: 2026-03-21
- Modified: 2026-04-21
- Words: 723 (4 min read)
- Tags: agentic-coding, patterns
- Prerequisites: agentic-coding-getting-started
- Related: building-krowdev-with-agents, claude-md-patterns, agentic-coding-getting-started
- Content map:
  - h2: The Trust Gradient
  - h2: What Agents Get Wrong
  - h3: Wrong Framework Version
  - h3: Dependency Creep
  - h3: Over-Engineering
  - h3: Inconsistent Patterns
  - h3: Silent Assumptions
  - h2: The Review Checklist
  - h2: The "Read the Diff" Habit
  - h2: After the Review
  - h2: Sources
- Crawl policy: same canonical content is exposed through HTML, Markdown, and llms-full; no crawler-specific content gate.

The agent writes the code. You own it. That means every line it produces is your responsibility — and you need a systematic way to review it.

## The Trust Gradient

Not all agent output deserves the same scrutiny. Calibrate review depth by risk:

| Risk Level | Examples | Review Approach |
|---|---|---|
| **Low** | CSS tweaks, adding a test, formatting | Scan the diff, verify it builds |
| **Medium** | New component, refactoring, API changes | Read every line, test manually |
| **High** | Auth logic, data mutations, build config | Read every line, trace the control flow, verify edge cases |

The mistake is treating everything as low-risk. The agent will happily modify your build pipeline with the same confidence it uses to fix a typo. The [krowdev retrospective](/article/building-krowdev-with-agents/) has concrete examples of this in a real project.
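
One way to make the gradient mechanical is a quick path scan before you start reading. A minimal sketch, assuming a Unix shell; the path patterns here are examples, so substitute the files you actually worry about:

```bash
# Flag high-risk paths in the last change before deciding review depth.
# These patterns are illustrative; list your own hot spots.
git diff --name-only HEAD~1 \
  | grep -E 'auth|package(-lock)?\.json|\.config\.(js|mjs|ts)$' \
  && echo "High-risk files touched: read every line." \
  || echo "No high-risk paths matched: a lighter pass may be fine."
```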

## What Agents Get Wrong

These failure modes appear consistently across projects and models:

### Wrong Framework Version

The agent's training data includes multiple versions of every framework. It will confidently generate Astro 4 patterns when you need Astro 6, or React class components when you use hooks. Check imports and API calls against your actual framework version.
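
A quick way to ground that check, assuming an npm-based project: confirm what is actually installed before trusting version-specific imports.

```bash
# What framework version does this project actually use?
npm ls astro        # or: npm ls react, vue, etc.
grep '"astro"' package.json
```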

### Dependency Creep

Ask for one feature, get three new npm packages. Agents default to installing libraries for things the standard library already handles. Before accepting a new dependency: check if the feature exists natively, check the package size, and check when it was last maintained.
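
npm itself can answer the size and maintenance questions. A minimal sketch, with `some-package` as a placeholder for whatever the agent just added:

```bash
# How big is it, when was it last published, and what does it drag in?
npm view some-package dist.unpackedSize   # install size in bytes
npm view some-package time.modified       # last registry activity
npm view some-package dependencies        # transitive baggage
```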

### Over-Engineering

A request for "a breadcrumb component" returns a recursive navigation framework with configuration objects and abstract base classes. The agent optimizes for generality; your project needs specificity. If the solution is more complex than the problem, push back.

### Inconsistent Patterns

The agent doesn't remember your conventions between sessions unless [CLAUDE.md](/guide/claude-md-patterns/) spells them out. It might use `camelCase` in one file and `snake_case` in another, or mix async patterns within the same module. Check for consistency with existing code.
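
Naming drift is easy to grep for. A rough heuristic, assuming a TypeScript codebase under `src/`; it will produce false positives, but it surfaces files worth a closer look:

```bash
# Spot snake_case assignments in a camelCase codebase (noisy but useful).
grep -rEn '[a-z]+_[a-z]+ *=' src/ --include='*.ts'

# Spot promise chains in a module that standardizes on async/await.
grep -rn '\.then(' src/ --include='*.ts'
```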

### Silent Assumptions

The agent makes decisions without flagging them. It might choose a specific caching strategy, pick a default timeout value, or assume a particular database schema. These assumptions are embedded in the code without comments. Read for implicit decisions, not just explicit logic.
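
There is no mechanical test for an unstated decision, but hardcoded numbers are a useful tell. A rough pass, again assuming code under `src/`:

```bash
# Surface literals that may encode silent decisions:
# timeouts, retry counts, cache TTLs, page sizes.
grep -rEin '(timeout|retry|ttl|maxage|limit).*[0-9]+' src/ --include='*.ts'
```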

## The Review Checklist

Run through this for every non-trivial diff:

**Does it build?**
```bash
npm run build  # or your equivalent
```
Never merge agent output you haven't built locally. "It looks right" is not verification.

**Does it match the request?**
Compare what you asked for against what you got. Agents frequently add features you didn't request, refactor code you didn't mention, or "improve" things that worked fine.

**Does it follow project conventions?** A quick dependency check is sketched after this list.
- Correct framework/library versions
- Consistent naming patterns
- Same file organization as existing code
- No new dependencies without justification
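
For the last item, the manifest diff shows what actually changed, assuming an npm project:

```bash
# Any new packages since the last commit? The answer should match
# what you explicitly approved.
git diff HEAD~1 -- package.json
git diff HEAD~1 --stat -- package-lock.json
```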

**Is it the right complexity?**
Count the files changed. If you asked for a simple feature and the diff touches 12 files, something went wrong. The right solution is usually the smallest one that works.
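
`git diff` can do the counting for you, assuming the agent's work is staged:

```bash
# Files touched and lines per file; a "simple feature" should be small here.
git diff --staged --stat
git diff --staged --shortstat   # one-line summary of the whole change
```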

**Are there security concerns?** A rough grep pass is sketched after this list.
- User input sanitized?
- No hardcoded secrets?
- No `eval()` or equivalent?
- API endpoints validated?
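
None of these reduce to a single command, but a grep pass catches the obvious cases. The patterns below are illustrative, not exhaustive:

```bash
# Obvious red flags only; a clean result is not a security review.
grep -rEn 'eval\(|new Function\(' src/
grep -rEin 'api[_-]?key|secret|password|token' src/ --include='*.ts'
```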

**Does it handle the edge cases that matter?**
The agent often adds error handling for impossible states while missing realistic edge cases. Focus on what happens with empty data, null values, network failures, and concurrent access.

## The "Read the Diff" Habit

The most important practice: read every diff before accepting it. Not skim — read. (See [Git Commands I Actually Use](/snippet/git-commands-i-use/) for the full reference card.)

```bash
git diff --staged    # what you're about to commit
git diff HEAD~1      # what just landed
```

This sounds obvious. In practice, after hours of productive agent sessions, the temptation to "just accept and move on" is strong. That's exactly when bugs slip through.

:::warning
The agent's confidence is not correlated with correctness. It will present broken code with the same certainty as working code. Your review is the only quality gate.
:::

## After the Review

If you find a problem the agent should have avoided, add a rule to [CLAUDE.md](/guide/claude-md-patterns/). This is how the constraint system grows — through real failures, not hypothetical ones. Every bug that makes it past review is a missing rule.
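
The mechanical version of that loop is a one-line append. A hypothetical example, assuming the failure was an unapproved dependency:

```bash
# Turn a reviewed failure into a standing constraint.
cat >> CLAUDE.md <<'EOF'

## Dependencies
- Never add a new npm package without asking first. Prefer the standard library.
EOF
```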

## Sources

- Anthropic, [Code Review](https://code.claude.com/docs/en/code-review)
- Anthropic, [Common workflows](https://code.claude.com/docs/en/common-workflows)
- Git, [`git-diff` documentation](https://git-scm.com/docs/git-diff)