Every engineering organization has it: a codebase that predates most of the current team, written in patterns nobody uses anymore, maintained through accumulated tribal knowledge rather than documentation, and resistant to change in ways that feel almost structural. Modernizing it is a recognized need. It is also persistently deprioritized because the risk-to-reward ratio in any given quarter looks unfavorable.
AI coding assistants change that calculation in specific, practical ways. They do not eliminate the risk of touching legacy code. But they dramatically reduce the time cost of the tasks that make legacy modernization expensive: understanding unfamiliar code, generating documentation, identifying dependencies, producing initial refactored drafts, and generating test coverage for code that has none.
This guide is about how to use them well - and where they still fail badly enough that you need to catch it.

What Makes Code "Legacy" in the First Place
"Legacy code" gets used loosely, but the working definition that matters for modernization is practical rather than chronological. Code is legacy when:
- It is difficult to change without causing unintended side effects
- It lacks sufficient test coverage to give you confidence that changes work correctly
- The original developers are unavailable to explain its assumptions
- It uses patterns, libraries, or language versions that are no longer actively supported
Code can be legacy at 6 months if it was written without tests and the author left the company. Code can be actively maintained for 20 years and not qualify as legacy if it has excellent test coverage and clear documentation. Age is a proxy, not the definition.
The practical problem legacy code creates is that every change to it requires expensive context-building before actual work begins. A developer tasked with adding a feature to a system they did not write must first understand what the system does, how the relevant components interact, and what existing assumptions their change might break. In a well-documented, well-tested codebase, this might take an afternoon. In a legacy codebase, it can take weeks.
What AI Coding Assistants Actually Help With
AI coding assistants are useful for legacy modernization in proportion to how well the task at hand suits pattern-matching on code structure. They are excellent at some things and genuinely unreliable at others.
Where they help:
- Explaining what a function or module does, at a level of specificity that would take hours to derive from reading alone
- Generating documentation for undocumented code based on its behavior
- Identifying potential dependency relationships between modules
- Suggesting refactored versions of isolated functions using modern patterns
- Generating initial test cases for a function based on its inputs and outputs
- Translating idioms from one language version to another (Python 2 to 3, ES5 to ES2022)
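To make that last item concrete, here is the kind of idiom translation assistants handle reliably. The `load_users` function is invented for illustration, not taken from any real system:

```python
# Python 2 original (hypothetical legacy snippet):
#
#   def load_users(path):
#       f = open(path)
#       users = {}
#       for line in f.read().split("\n"):
#           if line:
#               name, role = line.split(",")
#               users[name] = role
#       print "loaded %d users" % len(users)
#       return users
#
# The same function translated to Python 3 idioms:

def load_users(path):
    users = {}
    with open(path, encoding="utf-8") as f:  # context manager replaces bare open()
        for line in f:  # lazy iteration replaces read().split("\n")
            line = line.strip()
            if line:
                name, role = line.split(",")
                users[name] = role
    print(f"loaded {len(users)} users")  # print function + f-string replace %-formatting
    return users
```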
Where they are unreliable:
- Understanding global state and side effects that span multiple files
- Reasoning correctly about the business logic embedded in complex legacy systems
- Generating code that handles all the edge cases a long-running production system has accumulated
- Correctly identifying which refactoring changes are safe versus which ones will break something three modules away
The gap between these two lists is the reason AI-assisted modernization requires human judgment throughout, not just at review. The AI produces useful drafts; an engineer who understands the system must evaluate whether those drafts are actually correct.
Setting Up for Legacy Code Work
Context is the most critical input in AI-assisted legacy modernization. The less of the relevant code fits into the model's context window, the more incomplete its understanding of what it is analyzing.
Most AI coding assistants work best when you provide:
- The specific function or module you want analyzed (not the entire 50,000-line codebase)
- Enough of the surrounding code for the AI to understand the dependencies
- An explicit description of what you already know and what you are trying to understand
A prompt like "explain this function" on 20 lines of code produces a useful explanation. The same prompt on a 2,000-line file that gets truncated to fit the context window produces a partial and potentially misleading explanation.
Practical approaches for managing context:
- Decompose the legacy codebase into logical modules before starting AI-assisted work. This both limits context size and forces early documentation of what each module is supposed to do.
- Feed the AI the call graph context explicitly when you know a function interacts with others. "This function is called by X and calls Y. Explain what it does and what might break if we change the return type."
- Use the AI to generate an initial dependency map as a starting point, then verify it against the actual code. AI-generated dependency maps are frequently incomplete, but they are faster to verify than to produce from scratch.
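As a starting point for that verification, a small script can produce the static half of the picture. This is a minimal sketch assuming a flat directory of Python source files named `src/`; like the AI's map, it will miss dynamic imports, which is exactly why runtime verification still matters:

```python
# Sketch: build a first-pass static import map to check an AI-generated
# dependency map against. Only sees literal import statements.
import ast
from pathlib import Path

def static_imports(src_dir: str) -> dict[str, set[str]]:
    imports: dict[str, set[str]] = {}
    for path in Path(src_dir).rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        found: set[str] = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                found.add(node.module)
        imports[str(path)] = found
    return imports

if __name__ == "__main__":
    for module, deps in sorted(static_imports("src").items()):
        print(module, "->", ", ".join(sorted(deps)) or "(none)")
```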
Three Phases of AI-Assisted Modernization
Phase 1: Understanding
The first use case is the highest-value one. AI coding assistants dramatically reduce the time it takes to build a working model of what a legacy codebase does. You feed them code; they produce explanations, identify patterns, flag potential issues, and generate documentation drafts.
The output of the understanding phase should be documented and kept in version control. Start with a CLAUDE.md or similar file that captures what the AI helped you understand about each major module. This documentation survives the project and reduces the context-building cost for the next person who has to touch the system.
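The format matters less than the habit. A hypothetical entry might look like this - the module, entry point, and table names are invented for illustration:

```markdown
# CLAUDE.md (excerpt)

## billing/invoice_builder.py
- Purpose: assembles monthly invoices from raw usage records.
- Key entry point: build_invoice(account_id, period).
- Known side effects: writes audit rows to the invoice_audit table.
- Open question: proration is skipped for accounts created before 2019;
  the AI's explanation was plausible but is unverified against production.
```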
Martin Fowler's work on refactoring emphasizes understanding as a prerequisite to safe change. The AI does not shortcut the need to understand - it accelerates the understanding phase so you can spend more time on the change phase with confidence.
Phase 2: Refactoring
Once you understand a module, you can use AI assistance to produce candidate refactored versions. The key word is "candidate." AI-generated refactored code is a starting point for review, not a production-ready output.
Effective patterns for AI-assisted refactoring:
- Provide explicit constraints: "Refactor this function to modern async/await syntax. Do not change the function signature or the return type. Preserve all error handling."
- Ask for explanations of the changes alongside the code: "Explain each change you made and why."
- Request the AI identify what it is uncertain about: "Flag any parts of this refactoring where you are not sure the behavior is preserved."
The constraints matter because unconstrained AI refactoring tends to produce cleaner-looking code that subtly changes behavior in ways that are not immediately obvious. OWASP security guidance is particularly relevant here: legacy code often contains security controls that look like inefficient checks to an AI refactoring for cleanliness but actually serve a purpose. Be explicit that security-relevant code should not be simplified without discussion.
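Here is a small, self-contained sketch of what a well-constrained candidate looks like, with the hypothetical legacy original preserved as a comment. An unconstrained "cleaner" version might collapse the loop into a comprehension and drop the except clause - exactly the kind of subtle behavior change the constraints exist to prevent:

```python
# Hypothetical legacy original, preserved for comparison:
#
#   def parse_amounts(lines):
#       result = []
#       i = 0
#       while i < len(lines):
#           line = lines[i]
#           try:
#               result.append(float(line.strip()))
#           except ValueError:
#               result.append(0.0)
#           i = i + 1
#       return result
#
# Constrained candidate: modernize the loop, change nothing else.

def parse_amounts(lines):
    result = []
    for line in lines:  # for-loop replaces manual index bookkeeping
        try:
            result.append(float(line.strip()))
        except ValueError:
            result.append(0.0)  # preserved: callers rely on positional alignment
    return result
```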

Phase 3: Validation
The highest-risk aspect of legacy code modernization is verifying that refactored code still behaves correctly in all the cases the original handled - including the ones nobody thought to document.
AI coding assistants can generate initial test suites for legacy code, which is genuinely useful. A function with no tests that gets an AI-generated test suite becomes significantly safer to refactor. But AI-generated tests have a consistent failure mode: they test the happy path and the inputs the AI thinks are obvious, not the edge cases that a decade of production traffic has surfaced.
The practical approach is:
- Use AI to generate initial tests
- Use Git history and issue trackers to find bug fixes and incidents related to the component - these often point to edge cases worth testing explicitly
- Add hand-written tests for any edge cases not covered by the AI-generated suite
- Run the full test suite against the refactored code before merging
For Python legacy code specifically, pytest provides excellent tooling for this workflow. For JavaScript, the same pattern applies with Jest or Vitest. The AI generates the skeleton; you fill in the cases the AI would not know to include.
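A sketch of the resulting test file, using pytest. The `apply_discount` function, the incident, and the past bug fix are all invented to illustrate the pattern of AI-generated happy-path tests extended with history-driven edge cases:

```python
# test_apply_discount.py - a sketch of the generate-then-extend pattern.
# apply_discount stands in for a real legacy function; it is inlined here
# so the example is self-contained.

import pytest

def apply_discount(total, code):
    if code == "VIP10":
        return round(total * 0.9, 2)
    return total

# --- Tests an assistant typically generates (happy path) ---

def test_vip_code_applies_ten_percent():
    assert apply_discount(100.0, "VIP10") == 90.0

def test_unknown_code_is_a_noop():
    assert apply_discount(100.0, "NOPE") == 100.0

# --- Hand-added cases surfaced by git history and incident reports ---

def test_zero_total_does_not_go_negative():  # refund flow hit this path in an incident
    assert apply_discount(0.0, "VIP10") == 0.0

@pytest.mark.parametrize("total", [0.01, 19.995])
def test_rounding_matches_legacy_behavior(total):  # documented in a past bug fix
    assert apply_discount(total, "VIP10") == round(total * 0.9, 2)
```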
"The biggest mistake teams make with AI-assisted legacy modernization is treating the AI output as final rather than as a first draft. The AI is fast at producing plausible-looking code, but it has no knowledge of your specific production incidents, your specific data edge cases, or the business logic decisions that were made and then embedded in code without comments. A senior engineer reviewing AI-generated refactors needs to read them like they would read any other PR - carefully, with the context of what the code actually needs to do." - Dennis Traina, founder of 137Foundry
Common Mistakes in AI-Assisted Legacy Work
Refactoring scope creep. AI assistants often suggest improvements beyond the original scope because they pattern-match on "this could be cleaner." Stay disciplined about scope. Change what you need to change; do not accept all suggested improvements in a single pass.
Trusting AI-generated dependency analysis without verification. Dependency maps generated from AI analysis of code that uses dynamic module loading, monkey-patching, or other metaprogramming patterns are often wrong in ways that are not obvious until something breaks in production. Verify against actual runtime behavior.
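A short example of why this happens. Nothing in the code below names its real dependency in an import statement, so any purely static analysis - the AI's included - will miss the edge. The setting and module names are hypothetical:

```python
# A dependency decided at runtime by configuration is invisible to any
# analysis that only reads import lines.

import importlib

def load_handler(settings: dict):
    module_name = settings["HANDLER_MODULE"]  # e.g. "billing.handlers.v2", chosen at runtime
    module = importlib.import_module(module_name)
    return getattr(module, "handle")
```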
Generating tests that only cover the AI's model of the code. A test suite can pass against both the AI's understanding of a function and the original implementation and still be insufficient, if that understanding missed cases the original production traffic actually exercised.
Ignoring context window truncation. When the AI only saw part of the code, its explanations and refactoring suggestions are based on a partial picture. Long files should be broken into logical sections for separate analysis.
When AI-Assisted Modernization Makes Sense
Not every legacy system benefits from incremental AI-assisted modernization. The approach works best when:
- The business logic in the legacy system is correct and worth preserving
- The primary problems are structural (missing tests, outdated patterns, poor documentation) rather than fundamental design problems
- The team has at least one engineer who understands the system well enough to evaluate AI output
- Incremental delivery of improved components is acceptable (big-bang rewrites are not required)
When the legacy system has fundamental design flaws that no amount of refactoring will fix, or when the business logic itself is incorrect, modernization in place may be the wrong tool. The web development and AI automation services at 137Foundry include legacy system assessments that help teams make this determination before investing in the wrong approach.
The full 137Foundry services catalog covers the range of technical engagements from system assessment through implementation.

Practical Starting Points
If you want to start using AI coding assistants on your legacy codebase today, begin with the lowest-risk use cases first:
- Use the AI to generate documentation for your most opaque module. Review it carefully; it will reveal both what the AI got right and the gaps in your own understanding.
- Ask the AI to generate unit tests for a well-understood, stable function. Review the tests for completeness; add edge cases. Commit the tests. Now you have a safety net.
- Pick a small, self-contained function with the new tests protecting it and try an AI-assisted refactor of that function only.
This sequence - document, test, refactor - mirrors the approach Martin Fowler and the refactoring literature have advocated for decades. AI accelerates each step without changing the fundamental sequence that makes legacy modernization safe.
Visit 137Foundry for more engineering guides, and the 137Foundry blog for deeper reading on AI-assisted development workflows and technical decision frameworks.