Building a context engine for real codebases

Matt Legrand, Head of Design, Augment Code
October 15, 2024 · 7 min read

Every AI coding assistant uses the same foundation models. Context is what separates the useful from the useless.

If you’ve used AI coding tools on anything beyond toy projects, you’ve probably experienced the same frustration: they work brilliantly on isolated functions but completely fall apart when they need to understand how your actual system fits together. The AI suggests changes that look reasonable in isolation but break when integrated into your real codebase.

The problem isn’t the underlying models from OpenAI, Anthropic, or Google. It’s context. When an AI only sees a few dozen lines around your cursor, it can’t understand your architecture, existing patterns, or how changes ripple through your system. You end up constantly re-explaining your codebase, manually finding relevant files, and fixing suggestions that ignore critical dependencies.

At Augment, we set out to solve this from first principles: what would it take to build an AI that truly understands large, complex codebases?

The grep problem: Why most agents fail on real tasks

Most AI coding tools rely on basic keyword search, which is essentially grep with better UX. When you ask them to implement a feature, they search for relevant-looking file names and function names, dump whatever they find into the context window, and hope for the best.

This approach has fundamental limitations:

They don’t know what they don’t know. Grep can only find exact matches. It misses semantic relationships, architectural patterns, and implicit dependencies. If your authentication logic is in a file called security.ts but the AI searches for “auth,” it won’t find it.

They confuse proximity with relevance. Finding files that mention “payment” doesn’t mean you’ve found the right payment logic. Large codebases often have multiple payment implementations, legacy code, test fixtures, and documentation that all match the same keywords.

They can’t see patterns across the codebase. Understanding how your team handles error logging, database transactions, or API versioning requires analyzing dozens or hundreds of files to extract the common patterns. Grep gives you individual matches, not the bigger picture.
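
To make the first failure mode concrete, here's a toy TypeScript sketch contrasting the two retrieval styles. The files and vectors are invented for the example; a real system would use a code-aware embedding model rather than hand-picked numbers:

```ts
// Toy illustration: keyword match vs. embedding similarity.
// The documents and vectors are fabricated for this example.

type Doc = { path: string; text: string; embedding: number[] };

const docs: Doc[] = [
  { path: "src/security.ts", text: "verifies session tokens and user identity", embedding: [0.9, 0.1, 0.0] },
  { path: "src/payments.ts", text: "charges cards via the billing provider", embedding: [0.1, 0.9, 0.0] },
];

// Keyword search: exact-substring matching, the grep approach.
const keywordHits = docs.filter((d) => d.text.includes("auth") || d.path.includes("auth"));
console.log(keywordHits.map((d) => d.path)); // [] -- security.ts is missed entirely

// Semantic search: rank by cosine similarity to a query embedding.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

const queryEmbedding = [0.85, 0.15, 0.05]; // "auth" lands near security concepts
const ranked = [...docs].sort(
  (a, b) => cosine(queryEmbedding, b.embedding) - cosine(queryEmbedding, a.embedding)
);
console.log(ranked[0].path); // "src/security.ts" -- found despite zero keyword overlap
```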

The result? AI agents that start strong but degrade quickly. They make changes that work in isolation but break integration tests. They reinvent patterns that already exist elsewhere in your codebase. They require constant hand-holding and re-explanation.

[Diagram: real-time raw context (code, dependencies, documentation, style, recent changes, issues) flows through semantic understanding into curated context that powers completions, code review, remote agents, and agents; 4,456 sources narrowed to 682 relevant.]

Augment’s Context Engine maintains a real-time semantic index of your entire codebase, understanding not just what code exists but how different pieces relate to each other. When you ask it to implement a feature, it automatically surfaces the relevant architecture, patterns, and dependencies.

Building a semantic understanding of code

We built Augment’s Context Engine as a specialized search system for code. Not keyword matching, but semantic understanding of how software systems are structured and how they evolve.

Real-time indexing across your entire stack

The Context Engine maintains a live index of your codebase, tracking not just file contents but the relationships between different pieces of code. When you’re working on authentication, it knows to surface:

  • Your existing auth middleware and how it’s configured
  • Similar authentication patterns used elsewhere in the codebase
  • Related tests and how they’re structured
  • Configuration files and environment variables that affect auth
  • Recent changes to authentication logic and why they were made

This isn’t magic. It’s a purpose-built indexing system that understands code structure. We parse your codebase to build a graph of dependencies, analyze commit history to understand evolution, and use embeddings to capture semantic relationships that go beyond syntax.
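
Here's a rough sketch of what such an index could look like. The types and traversal below are our illustration, not Augment's internals:

```ts
// Illustrative code index: a dependency graph over files, each carrying
// an embedding and the commit that last touched it. Names are hypothetical.

type FileNode = {
  path: string;
  imports: string[];    // edges in the dependency graph
  embedding: number[];  // semantic vector for the file's contents
  lastTouched: string;  // commit hash that last changed this file
};

class CodeIndex {
  private nodes = new Map<string, FileNode>();

  // Called on every file change to keep the index live.
  upsert(node: FileNode): void {
    this.nodes.set(node.path, node);
  }

  // Walk import edges outward to find structurally related files.
  related(path: string, depth = 2): Set<string> {
    const seen = new Set<string>();
    let frontier = [path];
    for (let d = 0; d < depth && frontier.length > 0; d++) {
      const next: string[] = [];
      for (const p of frontier) {
        for (const dep of this.nodes.get(p)?.imports ?? []) {
          if (!seen.has(dep)) {
            seen.add(dep);
            next.push(dep);
          }
        }
      }
      frontier = next;
    }
    return seen;
  }
}

const index = new CodeIndex();
index.upsert({ path: "api/login.ts", imports: ["auth/session.ts"], embedding: [], lastTouched: "abc123" });
index.upsert({ path: "auth/session.ts", imports: ["config/env.ts"], embedding: [], lastTouched: "def456" });
console.log(index.related("api/login.ts")); // Set { "auth/session.ts", "config/env.ts" }
```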

Understanding patterns, not just files

When you ask Augment to “add logging to payment requests,” it doesn’t just search for files containing “payment” and “logging.” It maps the entire request path:

  1. The React component that initiates the payment
  2. The API endpoint that receives the request
  3. The payment service that processes it
  4. The database transactions involved
  5. The webhook handlers that confirm completion

Then it analyzes how your team already handles logging in similar flows. Does your team use structured logging? What log levels do you use for different scenarios? How do you handle sensitive data in logs? The Context Engine extracts these patterns and applies them consistently.
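
For illustration, suppose the pattern it extracts is structured JSON logging with sensitive payment fields redacted. A sketch of that hypothetical convention, the kind of thing the engine would detect in existing code and apply to new code:

```ts
// Hypothetical team convention: structured JSON logs with explicit
// redaction of sensitive payment fields. Field names are assumptions.

const SENSITIVE_FIELDS = new Set(["cardNumber", "cvv", "token"]);

function redact(payload: Record<string, unknown>): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(payload).map(([k, v]) => [k, SENSITIVE_FIELDS.has(k) ? "[REDACTED]" : v])
  );
}

function logPayment(event: string, payload: Record<string, unknown>): void {
  // One machine-parseable line per event, following the (assumed) team pattern.
  console.log(JSON.stringify({ level: "info", event, ...redact(payload), ts: new Date().toISOString() }));
}

logPayment("payment.initiated", { orderId: "ord_123", cardNumber: "4242424242424242" });
// {"level":"info","event":"payment.initiated","orderId":"ord_123","cardNumber":"[REDACTED]","ts":"..."}
```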

Beyond the repository: Understanding your team’s reality

Code doesn’t exist in isolation. Real software development involves commit history, issue trackers, documentation, design decisions, and tribal knowledge that never gets written down. We built the Context Engine to understand all of it.

Commit history: Why changes were made

Looking at your Git history, the Context Engine learns not just what changed but why. When someone refactored the authentication system six months ago, that context matters. When a particular pattern was introduced to fix a production bug, that’s valuable information.

This historical understanding prevents the AI from suggesting changes that repeat past mistakes or undo carefully considered decisions.

Codebase patterns: How your team actually builds

Every team has conventions that go beyond the style guide. Maybe you always wrap database calls in a specific error handling pattern. Maybe you have a particular way of structuring API responses. Maybe you use feature flags in a specific way.

The Context Engine learns these patterns by analyzing your existing code. It doesn’t impose generic “best practices.” It learns how your team actually builds and maintains consistency with those patterns.

External sources: Docs, tickets, and design decisions

Through native integrations with GitHub, Linear, Jira, Notion, and Confluence, the Context Engine pulls in context from across your development workflow. When you’re working on a feature, it can reference:

  • The original ticket that describes the requirements
  • Design documents that explain the architecture
  • Previous discussions about implementation approaches
  • Related issues and how they were resolved

We also support the Model Context Protocol (MCP), an emerging standard for connecting AI systems to external tools. This means Augment can connect to virtually any system in your development stack, including monitoring tools, deployment platforms, internal documentation, and custom APIs.
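
For teams wiring up custom sources, a minimal MCP server can expose an internal system as a tool. The sketch below uses the MCP TypeScript SDK; the server name, tool, and search logic are placeholders for whatever internal API you'd wrap:

```ts
// Minimal MCP server sketch using the official TypeScript SDK.
// The "search_runbooks" tool and its contents are hypothetical placeholders.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "internal-docs", version: "1.0.0" });

// Expose an internal documentation search as a tool the AI can call.
server.tool(
  "search_runbooks",
  { query: z.string().describe("search terms") },
  async ({ query }) => ({
    content: [{ type: "text" as const, text: `Results for "${query}" would come from your internal API.` }],
  })
);

// Communicate over stdio so any MCP-aware client can launch this server.
await server.connect(new StdioServerTransport());
```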

Tribal knowledge: What never gets written down

Some of the most valuable context never makes it into documentation. Edge cases discovered through painful debugging sessions. Performance gotchas that everyone on the team knows about. Subtle interactions between different parts of the system.

The Context Engine discovers this tribal knowledge through deep codebase analysis. It finds the patterns in how your team handles edge cases, the defensive checks that appear in critical paths, the comments that explain non-obvious decisions.

The Infinite Context Window: Signal over noise

Having access to your entire codebase doesn’t mean dumping everything into the AI’s context window. That would be overwhelming and counterproductive. The Context Engine’s job is to find exactly what matters for your specific request.

Intelligent retrieval and ranking

When you ask Augment to implement a feature, the Context Engine:

  1. Retrieves relevant context using semantic search, not just keyword matching
  2. Ranks by relevance based on how closely each piece of code relates to your request
  3. Compresses without losing information by extracting the essential patterns and relationships
  4. Respects access permissions ensuring the AI only sees code you have access to

The result is what we call the Infinite Context Window. From the user’s perspective, it feels like the AI has perfect knowledge of your entire codebase. In reality, it’s seeing a carefully curated subset, but it’s the right subset.
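
Here's that pipeline in miniature. The shapes, scores, and permission model are illustrative assumptions, not Augment's implementation; in particular, real compression extracts patterns rather than simply truncating:

```ts
// Miniature retrieve -> rank -> filter -> assemble pipeline. Candidates are
// assumed to come from semantic retrieval (step 1); shapes are hypothetical.

type Chunk = { path: string; text: string; score: number; readableBy: Set<string> };

function buildContext(candidates: Chunk[], user: string, budget: number): string {
  return candidates
    .filter((c) => c.readableBy.has(user)) // respect access permissions
    .sort((a, b) => b.score - a.score)     // rank by semantic relevance
    .slice(0, budget)                      // crude stand-in for compression
    .map((c) => `// ${c.path}\n${c.text}`) // assemble the curated context window
    .join("\n\n");
}

const candidates: Chunk[] = [
  { path: "auth/middleware.ts", text: "...", score: 0.92, readableBy: new Set(["alice"]) },
  { path: "legacy/auth_v1.ts", text: "...", score: 0.41, readableBy: new Set(["alice"]) },
  { path: "secrets/keys.ts", text: "...", score: 0.88, readableBy: new Set(["bob"]) },
];
console.log(buildContext(candidates, "alice", 2)); // middleware ranks first; secrets excluded
```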

Proof in production: The Elasticsearch study

We validated the Context Engine’s effectiveness through a blind study comparing AI-generated pull requests to human-written code. We used the Elasticsearch repository (3.6 million lines of Java from 2,187 contributors) as our test case.

We generated 500 pull requests using Augment and compared them to merged code written by humans. The results: Augment’s pull requests matched or exceeded human code quality, significantly outperforming other AI coding tools.

The difference wasn’t the foundation model. It was context. Augment understood the Elasticsearch codebase well enough to write code that fit naturally into the existing architecture, followed established patterns, and handled edge cases appropriately.

[Chart: pull request quality comparison across Cursor, Claude Code, and Augment Code.]