Agentic Refactoring: Break a Big Refactor Into Parallel Panes
A tutorial for splitting a large refactor across multiple AI panes, coordinating through directory-scoped tickets, and merging results without breaking the build.
April 18, 2026 · 6 min read
The problem
You have a 40,000-line codebase and a refactor that touches every corner of it. Rename User to Account across services. Replace the custom logger with pino. Migrate callbacks to async/await. These are mechanical tasks, but the mechanics are boring and error-prone, and doing them in one terminal with one agent takes hours. The agent re-reads the same files, forgets the pattern halfway through, produces inconsistent output, and burns tokens on repeated context loading.
The right structure for this class of work is parallelism with strong locality. Split the codebase by directory, give each directory to a dedicated agent in its own pane, run them simultaneously, and merge at the end. The agents don't need to coordinate: the refactor rule is the same everywhere, and conflicts only happen where directories overlap, which you design out of the split. What you get is a refactor that finishes in forty minutes instead of four hours, with five sets of eyes on the diff at the end.
The grid setup
3x2 grid on a single large monitor, six panes total. Top row: three Claude Code instances, one per major subdirectory (services/, packages/, apps/). Bottom row: two Codex instances on the remaining directories (tools/, tests/), plus one shell pane for running the build and git commands. All six panes point at the same repo root but each agent is scoped to a specific subdirectory by its starting prompt.
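Since each agent is scoped only by its starting prompt, it can help to generate the five prompts from one template so the wording never drifts between panes. The sketch below is one way to do that; `pane_prompt` and `print_all_prompts` are hypothetical helper names, and the prompt text is the one used in the steps of this tutorial.

```shell
# Print the starting prompt for one pane, given its directory name.
# pane_prompt is a hypothetical helper, not part of any tool.
pane_prompt() {
  printf 'Read REFACTOR.md. Apply the rule to every file under %s/ only. ' "$1"
  printf 'Do not touch any other directory. When done, list the files you changed.\n'
}

# Print one prompt per agent pane, in the pane order described above:
# panes 1-3 are the top row, panes 4-5 the bottom row (pane 6 is the shell).
print_all_prompts() {
  pane=1
  for dir in services packages apps tools tests; do
    printf 'Pane %s: ' "$pane"
    pane_prompt "$dir"
    pane=$((pane + 1))
  done
}

# Usage: print_all_prompts
```

Paste each printed line into the matching pane. Keeping the template in one place means a wording change reaches every agent.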
Why Claude on top and Codex on bottom? Personal preference after running this setup a few times. Claude tends to handle the larger, business-logic-heavy directories more carefully; Codex moves faster through tool and test code where the stakes are lower. Your mileage will vary — swap them until you find the pairing that fits your codebase.
Step by step
- Create a new space pointed at the monorepo root, for example ~/code/platform-monorepo. Pick the 3x2 preset.
- Before starting any agents, write the refactor rule in a single file at the repo root and call it REFACTOR.md. Example rule: "Rename the User type and all its references to Account. Update field names: userId to accountId, user_email to account_email. Preserve database column names (those change in a later migration). Update imports and exports to match."
- Create a branch: git checkout -b rename-user-to-account.
- In pane 1 (Claude, services), prompt: "Read REFACTOR.md. Apply the rule to every file under services/ only. Do not touch any other directory. When done, list the files you changed."
- Repeat for panes 2 and 3 with packages/ and apps/. Repeat for panes 4 and 5 with tools/ and tests/ using Codex.
- While the agents work, use pane 6 (shell) to monitor: watch -n 5 'git status --short | wc -l'. Satisfying to watch the number climb.
- When each agent reports done, spot-check its output with git diff services/ in the shell pane. Look for missed spots: run rg '\bUser\b' services/ to catch anything the agent left behind.
- If an agent missed files, hand them back: "You missed these files: [paste rg output]. Apply the same rule." Agents are good at completing work you point to; they are bad at exhaustive discovery on their own.
- Once all five directories are clean, run npm run build && npm test in the shell pane. Fix compile errors one at a time, usually in the nearest relevant pane.
- Commit in logical chunks, one directory per commit. Easier to bisect later.
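Before running the build, it may be worth confirming that no agent edited anything outside the five scoped directories. The sketch below is one possible check, not part of git: `out_of_scope` is a hypothetical helper that reads changed paths on stdin and prints any path that is not under one of the directories passed as arguments.

```shell
# Read changed paths on stdin (one per line) and print every path that is
# not under any of the allowed directories given as arguments.
# Exit status is 0 when everything is in scope, 1 otherwise.
# out_of_scope is a hypothetical helper name.
out_of_scope() (
  found=0
  while IFS= read -r path; do
    ok=0
    for dir in "$@"; do
      case "$path" in
        "${dir%/}/"*) ok=1 ;;   # path sits inside an allowed directory
      esac
    done
    if [ "$ok" -eq 0 ]; then
      printf '%s\n' "$path"     # an agent strayed outside its scope
      found=1
    fi
  done
  [ "$found" -eq 0 ]
)

# Usage (tracked modifications plus new untracked files):
#   { git diff --name-only; git ls-files --others --exclude-standard; } \
#     | out_of_scope services packages apps tools tests
```

If REFACTOR.md is still untracked at the repo root, it will show up in the output; committing it before starting the agents keeps the report clean.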
What this unlocks
Wall-clock time compression. A refactor that would take a single agent four hours of sequential chugging finishes in under an hour. The human cost is the setup — splitting the work and writing the rule — which is mostly one-time effort.
Consistency across a large change. All agents follow the same REFACTOR.md file, so they produce structurally similar diffs. You no longer end up with one part of the codebase using account_id and another using accountId because the agent forgot which convention you asked for.
Cheap error recovery. If pane 3 goes off the rails and renames things it shouldn't have, you git checkout -- apps/ and restart just that pane. The other five panes are unaffected. Single-agent workflows don't give you this granularity without frequent commits.
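One detail to keep in mind: git checkout -- apps/ restores tracked files only, so any new files the agent created stay behind. A minimal sketch of a fuller reset follows, assuming the agent has not staged anything and that nothing else untracked in that directory is worth keeping; `reset_pane_dir` is a hypothetical helper name.

```shell
# Throw away one agent's uncommitted work in a single directory.
# reset_pane_dir is a hypothetical helper name.
# WARNING: git clean permanently deletes untracked files under the directory.
reset_pane_dir() {
  git checkout -- "$1"    # restore tracked files to their index state
  git clean -fd -- "$1"   # remove untracked files and directories the agent added
}

# Usage: reset_pane_dir apps
```

Running git clean -nd -- apps first prints what would be deleted without removing anything, which is a reasonable habit before the real reset.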
A template for future refactors. Once you've done one directory-split refactor, you have a REFACTOR.md format and a grid layout you can reuse forever. Each subsequent refactor gets faster to set up.
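As an illustration of what a reusable REFACTOR.md might look like, here is a sketch that writes one from a shell function. The rule text comes from the example in this tutorial; the section headings and the `write_refactor_rule` name are assumptions, so adapt them to whatever structure your agents follow best.

```shell
# Write a REFACTOR.md skeleton to the given path (default: ./REFACTOR.md).
# write_refactor_rule and the section layout are hypothetical.
write_refactor_rule() {
  cat > "${1:-REFACTOR.md}" <<'EOF'
# Refactor rule: rename User to Account

## Rule
- Rename the User type and all its references to Account.
- Update field names: userId to accountId, user_email to account_email.
- Update imports and exports to match.

## Leave alone
- Database column names (those change in a later migration).

## Scope
- Work only in the directory named in your prompt.
- When done, list the files you changed.
EOF
}

# Usage: write_refactor_rule            (writes ./REFACTOR.md)
#        write_refactor_rule /tmp/REFACTOR.md
```

For the next refactor, only the Rule and Leave alone sections should need to change.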
Variations
Two-pane quick refactor. For small refactors touching two or three directories, skip the big grid. Use a 1x2 or 2x1 layout, one agent per directory. Takes ten seconds to set up and gets you most of the parallelism benefit.
Refactor + watcher split. 2x2 grid. Three agents refactor three directories; the fourth pane runs tsc --watch or equivalent, constantly rebuilding. As each agent saves files, the watcher tells you whether the build broke. Tight feedback loop.
Lead + followers. One agent in a big pane on the left drafts the refactor rule by doing the first directory manually. Three smaller panes on the right read its diff as a reference and apply the same pattern elsewhere. Use this when the rule is fuzzy and the first pass clarifies it.
Caveats
Splitting by directory only works if your directories are reasonably decoupled. In a monolithic app where everything imports from everything, parallelism doesn't buy you much because every agent ends up fighting the same cross-cutting concerns. Run a dependency analysis first if you're unsure.
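A full dependency analysis needs a proper tool, but a rough first signal can come from counting import lines that reach into sibling directories. The sketch below assumes imports reference siblings by directory name in the path (for example from '../packages/core/a'); imports that go through package aliases will not be counted, so treat the numbers as a lower bound. `cross_imports` is a hypothetical helper name.

```shell
# For one top-level directory, count import lines that mention each of the
# other directories. Arguments: repo root, directory to scan, then siblings.
# cross_imports is a hypothetical helper; the pattern only catches
# path-style imports such as  from '../packages/core/a'.
cross_imports() (
  root="$1"; dir="$2"; shift 2
  for other in "$@"; do
    n=$(grep -rEh "from ['\"][^'\"]*${other}/" "$root/$dir" 2>/dev/null | wc -l | tr -d ' ')
    printf '%s -> %s: %s\n' "$dir" "$other" "$n"
  done
)

# Usage: cross_imports . services packages apps tools tests
```

High counts in both directions between two directories suggest they should go to the same agent rather than to separate panes.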
Agents are not reliable at global invariants. "Make sure no file still references User" is a property you have to verify with rg after the fact, every time. Do not skip this step.
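The check can be wrapped in a small function so it is easy to rerun per directory. This sketch uses grep rather than rg purely so it runs without extra tools, and it also looks for the old field names from the example rule; `leftovers` is a hypothetical helper name. Hits on user_email may be legitimate database column names, which the rule says to preserve, so review the output rather than treating every match as a miss.

```shell
# Print every line under the given directory that still uses an old name,
# and return 1 if any were found (0 if the directory is clean).
# leftovers is a hypothetical helper name.
leftovers() {
  # -w matches whole words only, so identifiers such as UserService
  # are not reported, mirroring the \bUser\b pattern used with rg.
  if grep -rnwE 'User|userId|user_email' "$1"; then
    return 1   # at least one old name is still present
  fi
  return 0
}

# Usage: for d in services packages apps tools tests; do leftovers "$d"; done
```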
Token costs balloon. Five agents working for forty minutes can cost more than one agent working for four hours, depending on context window reuse. If budget is tight, run three panes and accept the longer wall-clock time.
FAQ
What if two agents need to edit the same file?
They shouldn't, if you scoped by directory. If the refactor genuinely requires cross-directory edits (say, a shared types.ts that everyone imports), do those edits first, in a single pane, and commit. Then start the grid.
Can I use this for a framework migration, like React class components to hooks?
Yes, and the pattern works even better there because each component file is independent. Hand each agent a list of files to convert. Same REFACTOR.md structure.
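One way to build those per-agent file lists is to deal the files out round-robin. The sketch below is only an illustration: `split_round_robin` is a hypothetical helper, and the grep pattern in the usage comment is an assumption about how class components are declared in your code.

```shell
# Read file paths on stdin and deal them round-robin into N list files
# named PREFIX-0.txt ... PREFIX-(N-1).txt.
# split_round_robin is a hypothetical helper name.
split_round_robin() {
  awk -v n="$1" -v prefix="$2" '{ print > (prefix "-" (NR % n) ".txt") }'
}

# Usage (the pattern is an assumption about your components):
#   grep -rlE 'extends (React\.)?Component' apps \
#     | split_round_robin 3 /tmp/hooks-batch
# Then hand /tmp/hooks-batch-0.txt to pane 1, and so on.
```

Round-robin keeps the batches close to equal in file count, though not necessarily in effort; move a few large components by hand if one agent ends up with the heavy ones.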
Do I need to commit between agents?
Commit when you are confident in a directory. Keeping all five directories uncommitted and in-flight at once is fine during the work, but commit before you close the space so you don't lose anything to a PTY crash.