
The Codebase Style Graph: A Technical Explainer

By Tariq Osei · December 2, 2024


Every production codebase has style. Not just the style that ESLint and Prettier enforce — the deeper style that accumulates through years of team decisions: how services communicate with each other, what naming patterns signal "this is public API" vs. "this is an internal helper," how error handling is structured across similar components, when a team prefers composition over inheritance. None of this is in a ruleset. It lives in the code itself.

A codebase style graph is a structured representation of those implicit patterns. Built from the actual commit history and AST structure of a repository, it maps the relationships between naming conventions, abstraction layers, call patterns, and project-specific conventions. When Replixa runs a review, it's checking an incoming diff against this graph — not against a universal ruleset.

Why static linters are solving a different problem

ESLint, Pylint, RuboCop, and their equivalents are excellent at what they do: enforcing rules that are true across all JavaScript, all Python, all Ruby. No unused variables. Consistent brace style. No implicit type coercions. These are universal rules — they apply equally to a 10-line script and a 5-million-line monorepo.

But a 5-million-line monorepo has properties that no universal ruleset can capture. Consider a backend services repository where one team has consistently used the pattern XxxService for stateless computation and XxxManager for stateful coordination. That pattern isn't in any linter. A new engineer who introduces AuthManager for a stateless authentication helper hasn't violated any ESLint rule — but they've introduced a subtle inconsistency that will confuse future engineers trying to understand the system's structure.

Static linters also only know what they have been told: they check against rules that someone wrote explicitly. They can't detect that your codebase treats all async operations a particular way, or that your team has adopted a convention around how database entities are structured, unless someone wrote a custom lint rule for each pattern. Custom lint rules require maintenance, documentation, and enforcement, which makes them expensive to write and brittle over time.

How the style graph is constructed

The style graph construction process starts with the codebase's git history and AST. At a high level, it involves three passes:

Pass 1: Symbol extraction

Replixa parses each file in the repository into an abstract syntax tree and extracts symbols: function names, class names, interface names, module identifiers, type definitions, variable names in specific syntactic positions. Each symbol is tagged with its structural context — is it exported? Is it used across module boundaries? Is it in a test file? Is it part of a specific directory namespace?
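To make the symbol extraction pass concrete, here is a minimal sketch using Python's standard ast module. It is an illustration under stated assumptions, not Replixa's implementation: the Symbol record, the extract_symbols function, and the heuristics for "exported" and "test file" are all hypothetical, and only top-level functions and classes are collected.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    kind: str        # "function" or "class"
    exported: bool   # assumed heuristic: no leading underscore
    in_test: bool    # assumed heuristic: file name starts with "test_"
    namespace: str   # directory portion of the file path


def extract_symbols(source: str, path: str) -> list[Symbol]:
    """Collect top-level function and class symbols with structural tags."""
    namespace, _, filename = path.rpartition("/")
    in_test = filename.startswith("test_")
    symbols = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue  # imports, assignments, etc. are ignored in this sketch
        exported = not node.name.startswith("_")
        symbols.append(Symbol(node.name, kind, exported, in_test, namespace))
    return symbols


SAMPLE = '''
class UserRepository:
    pass

def get_user(user_id):
    return None

def _normalize(value):
    return value
'''

symbols = extract_symbols(SAMPLE, "services/users/store.py")
for symbol in symbols:
    print(symbol)
```

A real extractor would also need to walk nested scopes and resolve cross-module usage, which this sketch leaves out.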

Pass 2: Pattern clustering

Extracted symbols are analyzed for naming patterns. This is where the graph starts to take shape. Pattern clustering identifies that, for example, getXxx functions in a given module consistently return specific types, that handleXxx functions consistently appear in event handler contexts, that class names ending in Repository consistently implement a specific interface pattern. These clusters become nodes in the graph.

This pass also identifies structural patterns at the module level: how modules import from each other, what the typical depth of call chains is, how error propagation patterns differ across subsystems. A monorepo with a payments service and a notifications service might have meaningfully different calling conventions in each — the graph captures both as distinct subgraphs.
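A rough sketch of the naming side of this pass might look like the following. It assumes a deliberately simple notion of "pattern" (the leading word of a function-style name, or the trailing word of a class-style name) and a hypothetical min_support cutoff. It ignores return types, call contexts, and module structure, all of which the real clustering would need.

```python
import re
from collections import defaultdict
from typing import Optional

# Splits camelCase, PascalCase, and snake_case names into words.
WORD = re.compile(r"[A-Z][a-z0-9]*|[a-z0-9]+")


def pattern_key(name: str) -> Optional[str]:
    """Map a name to a coarse pattern such as 'getXxx' or 'XxxRepository'."""
    words = WORD.findall(name)
    if len(words) < 2:
        return None  # single-word names carry no prefix/suffix signal
    if name[0].islower():
        return f"{words[0]}Xxx"   # function-style: key on the leading word
    return f"Xxx{words[-1]}"      # class-style: key on the trailing word


def cluster_names(names: list[str], min_support: int = 2) -> dict[str, list[str]]:
    """Group names by pattern key and keep clusters with enough members."""
    clusters = defaultdict(list)
    for name in names:
        key = pattern_key(name)
        if key is not None:
            clusters[key].append(name)
    return {key: members for key, members in clusters.items()
            if len(members) >= min_support}


names = ["getUser", "getOrder", "handleClick", "handleSubmit",
         "UserRepository", "OrderRepository", "parseDate"]
clusters = cluster_names(names)
print(clusters)
```

Here parseDate is dropped because a single occurrence is not enough evidence of a convention. Each surviving cluster corresponds to one candidate node in the graph.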

Pass 3: Boundary and change-history weighting

Not all patterns carry equal weight. Patterns that appear consistently across hundreds of commits, that were introduced early in the codebase history, and that have never been revised are high-confidence conventions. Patterns that appear in a single engineer's recent commits are lower confidence. The style graph weights nodes by signal strength — high-confidence patterns generate higher-confidence review suggestions; low-confidence patterns generate suggestions with appropriate hedging.
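One way to picture the weighting is as a score that combines how often a pattern appears, how early it was introduced, and how rarely it has been revised. The formula below is purely illustrative: the PatternHistory fields, the saturation point of 100 commits, and the 0.7 wording cutoff are assumptions made for this sketch, not values from Replixa.

```python
from dataclasses import dataclass


@dataclass
class PatternHistory:
    commits_seen: int      # commits in which the pattern appears
    first_seen_index: int  # commit index that introduced it (0 = oldest)
    total_commits: int     # length of the repository history
    revisions: int         # times the pattern itself was changed


def confidence(history: PatternHistory) -> float:
    """Combine support, age, and stability into a score between 0 and 1."""
    support = min(history.commits_seen / 100, 1.0)  # assumed saturation point
    age = 1.0 - history.first_seen_index / history.total_commits
    stability = 1.0 / (1 + history.revisions)
    return support * age * stability


def suggestion_prefix(score: float) -> str:
    """Hedge the wording of a suggestion when the pattern is weakly supported."""
    if score >= 0.7:  # assumed cutoff
        return "This module consistently uses"
    return "Nearby code often uses"


established = PatternHistory(commits_seen=300, first_seen_index=10,
                             total_commits=1000, revisions=0)
recent = PatternHistory(commits_seen=5, first_seen_index=950,
                        total_commits=1000, revisions=0)

print(suggestion_prefix(confidence(established)))
print(suggestion_prefix(confidence(recent)))
```

The long-lived, never-revised pattern produces the assertive wording, while the pattern from a handful of recent commits produces the hedged one.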

Change history also helps identify intentional divergences. If a subsystem has consistently used a different pattern than the rest of the codebase over 50+ commits, the graph recognizes that as a deliberate local convention rather than an error. We're not trying to homogenize a codebase — we're trying to catch unintentional deviations from whatever the codebase's actual conventions are.
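The local-convention rule can be sketched as a small lookup: if a subsystem has sustained its own pattern for long enough, that pattern wins inside the subsystem. The function name, the input shape, and the use of 50 commits as a hard cutoff are assumptions for illustration only.

```python
def applicable_convention(repo_wide: str,
                          local_support: dict[str, int],
                          min_commits: int = 50) -> str:
    """Pick the convention to review against inside one subsystem.

    local_support maps a pattern name to the number of commits in this
    subsystem that follow it.
    """
    if local_support:
        pattern, commits = max(local_support.items(), key=lambda item: item[1])
        if pattern != repo_wide and commits >= min_commits:
            return pattern  # sustained divergence: treat as deliberate
    return repo_wide


# A subsystem with 72 commits of its own error-handling style keeps it.
payments = applicable_convention("result_wrapper", {"exceptions": 72})
# Four stray commits are not enough to count as a local convention.
notifications = applicable_convention("result_wrapper", {"exceptions": 4})
print(payments, notifications)
```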

What the graph catches that linters miss

Consider a real scenario from a TypeScript backend codebase (a growing platform engineering team with ~80 engineers). Their convention was to wrap all external API calls in a typed Result<T, E> type — a pattern they'd adopted consistently across their service layer for two years. A new engineer submitting a PR for a third-party notification service integration used the older Promise<void> pattern with a try/catch, because that's what the third-party SDK documentation showed. No linter flags this — both patterns are valid TypeScript. The codebase style graph flags it immediately, with an inline suggestion showing the project-consistent Result wrapper pattern.

Other categories the graph catches that linters miss:

Naming convention violations specific to a subsystem. If a service names its public exports with a specific prefix, such as cfg_xxx for configuration values, the graph flags a PR that adds configXxx to the same module.
Import pattern inconsistencies. If a module consistently imports from a specific internal path alias rather than a relative path, a PR that introduces relative imports into that module is inconsistent with local convention.
Abstraction layer violations. If domain logic has never appeared in the route handler layer, and a PR introduces a complex business rule directly into a controller function, the style graph can flag this as a potential layering inconsistency — not as a lint error, but as a stylistic observation worth discussing in review.
Test structure divergence. Teams often develop consistent patterns for structuring unit tests: naming conventions for describe blocks, how test fixtures are set up, whether mock setup happens in beforeEach or inline. A PR that structures tests differently from the rest of the repository is caught by pattern divergence detection.
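As a small illustration of the first category, the sketch below infers a module's dominant name prefix from its existing symbols and flags added names that don't follow it. The helper names and the 80% share threshold are hypothetical. The real graph compares against weighted pattern nodes rather than a single prefix count.

```python
from collections import Counter
from typing import Optional


def dominant_prefix(names: list[str], min_share: float = 0.8) -> Optional[str]:
    """Return the prefix (up to the first underscore) most names share, if any."""
    prefixes = Counter(name.split("_", 1)[0] + "_" for name in names if "_" in name)
    if not prefixes:
        return None
    prefix, count = prefixes.most_common(1)[0]
    return prefix if count / len(names) >= min_share else None


def flag_divergent(existing: list[str], added: list[str]) -> list[str]:
    """List names added by a PR that ignore the module's dominant prefix."""
    prefix = dominant_prefix(existing)
    if prefix is None:
        return []  # no clear convention, so nothing to flag
    return [name for name in added if not name.startswith(prefix)]


existing = ["cfg_timeout", "cfg_retries", "cfg_region", "cfg_endpoint"]
added = ["cfg_batch_size", "configMaxConnections"]
flagged = flag_divergent(existing, added)
print(flagged)
```

Only configMaxConnections is flagged. When a module has no dominant prefix, the check stays silent rather than guessing.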

What the style graph is not

We're not suggesting the style graph replaces human architectural review. A graph-based analysis of naming and structural patterns can't reason about whether a design decision is correct in a deeper sense — whether a particular abstraction boundary is appropriate for the problem domain, whether a data model will support future requirements, whether a particular algorithm is efficient enough for the expected load.

The style graph is also not a source of ground truth about what your codebase should look like. If a team has accumulated bad patterns over time — and every codebase has some — the graph will faithfully represent those bad patterns. The graph describes what is, not what should be. Teams that want to migrate away from legacy patterns need to do that explicitly, not via the style graph.

The value is narrower and more specific: making reviewers aware when a PR diverges from established local convention, in a context where that divergence is likely unintentional. That's a genuine gap in the current tooling ecosystem, and it's the gap the style graph is built to close.

Graph maintenance and drift

Codebase style isn't static. Teams evolve conventions, adopt new frameworks, refactor legacy patterns. The style graph needs to reflect these changes rather than cementing historical patterns forever.

Replixa rebuilds the style graph incrementally as new code is merged. A convention that was universal six months ago but has been systematically replaced over the past 50 commits will have decreasing weight in the graph — its suggestions will be softer and eventually suppressed. A new convention that a team has adopted consistently over the past 30 PRs will gain weight and start generating higher-confidence suggestions.
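One simple way to model this drift is an exponentially decayed average over recent commits, so that newer evidence counts for more than older evidence. This is a sketch of the general idea rather than Replixa's actual update rule, and the half-life of 25 commits is an assumed parameter.

```python
def decayed_weight(observations: list[bool], half_life: float = 25) -> float:
    """Weight of a pattern given per-commit observations, oldest first.

    Each entry is True if that commit followed the pattern. A commit that
    is half_life commits old counts half as much as the newest one.
    """
    decay = 0.5 ** (1 / half_life)
    followed = 0.0
    total = 0.0
    for age, follows in enumerate(reversed(observations)):
        weight = decay ** age
        total += weight
        if follows:
            followed += weight
    return followed / total if total else 0.0


# Universal for 200 commits, then systematically replaced over the last 50.
old_convention = [True] * 200 + [False] * 50
# Absent for 200 commits, then adopted consistently over the last 50.
new_convention = [False] * 200 + [True] * 50

print(round(decayed_weight(old_convention), 2))
print(round(decayed_weight(new_convention), 2))
```

With these inputs the replaced convention's weight falls below 0.3 while the newly adopted one rises above 0.7, even though the older pattern still has far more total commits behind it.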

Teams can also explicitly signal intent through the Replixa config: marking a directory as undergoing a planned refactor will suppress style graph suggestions in that scope, preventing the graph from noisily flagging intentional changes. Conversely, marking a subsystem as high-convention-stability will raise the confidence threshold for suggestions, ensuring that only strong pattern deviations generate review comments there.
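The effect of those two settings can be sketched as a filter applied before a suggestion becomes a review comment. Everything named here is hypothetical: this article does not specify Replixa's config keys, so ScopeConfig, its fields, and the threshold values are stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ScopeConfig:
    # Hypothetical settings; not Replixa's actual config schema.
    refactor_paths: list[str] = field(default_factory=list)
    stable_paths: list[str] = field(default_factory=list)
    default_threshold: float = 0.5
    stable_threshold: float = 0.8


def should_comment(path: str, confidence: float, config: ScopeConfig) -> bool:
    """Decide whether a suggestion for this file becomes a review comment."""
    if any(path.startswith(prefix) for prefix in config.refactor_paths):
        return False  # planned refactor: suppress suggestions in this scope
    if any(path.startswith(prefix) for prefix in config.stable_paths):
        return confidence >= config.stable_threshold
    return confidence >= config.default_threshold


config = ScopeConfig(refactor_paths=["services/legacy/"],
                     stable_paths=["services/payments/"])

print(should_comment("services/legacy/orders.py", 0.95, config))
print(should_comment("services/payments/charge.py", 0.6, config))
print(should_comment("services/search/index.py", 0.6, config))
```

A strong suggestion in the refactor scope is suppressed, a moderate one in the stable scope falls short of the raised threshold, and the same moderate suggestion elsewhere is posted.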

The result is a review signal that tracks the actual state of your codebase's style rather than the state of your linter config file — which for most organizations diverged from the actual codebase a long time ago.