Key Takeaways
- Technical debt is fundamentally a structural problem, rooted in the complex relationships between components, not just in lines of code.
- Current AI coding agents (even future GPT-5 or Claude 4 models) have a critical blind spot: they excel at local analysis (fan-out) but fail to grasp the global, repository-wide impact of a function (fan-in), making them dangerous for complex refactoring.
- Converting a codebase into a graph database makes this invisible structural debt visible and queryable, allowing us to precisely identify high-risk functions and architectural bottlenecks.
- Epsilla's Semantic Graph is the enterprise-grade solution, extending this concept beyond code to include documentation, tickets, and APIs, providing the holistic awareness Agent-as-a-Service (AaaS) platforms need to operate safely and effectively.
The promise of Agentic AI in software development is intoxicating. We envision a future where autonomous agents, powered by models like GPT-5 or Llama 4, can independently diagnose bugs, refactor legacy systems, and implement new features. Yet, for any engineering leader who has managed a system of non-trivial complexity, this vision is shadowed by a single, terrifying question: how can we trust an agent that doesn’t understand the architecture?
The current generation of coding agents, for all their fluency, operates with a profound structural blindness. They see the code, but they don't see the system. They can analyze the contents of a function—what it does, what other functions it calls—with remarkable precision. This is the concept of "fan-out." But they have almost no inherent understanding of the most critical risk factor: who calls this function? This is "fan-in," a measure of a function's systemic importance, its "blast radius" if something goes wrong.
Calculating fan-out is easy; it’s a local analysis contained within a single file or context window. Calculating fan-in requires a holistic, repository-wide scan. It requires understanding the entire system as an interconnected graph. Without this map, we are handing our most powerful agents a scalpel in the dark. A recent open-source experiment analyzing the OpenClaw repository provides a stark, quantitative look at this problem—and points directly to the solution.
From Code Lines to a Queryable System Graph
To move beyond superficial analysis, a fascinating experiment was conducted using an open-source tool called "CodeGraph." The entire OpenClaw codebase—over 6,000 source files, 21,000 functions, and 36,000 call relationships—was ingested and modeled as a graph database. In this model, every function becomes a node, and every call or import becomes a directed edge connecting them.
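CodeGraph's actual schema isn't reproduced here, but the node-and-edge model is easy to sketch with plain Python dictionaries standing in for a real graph database. Every function name below except createConfigIO and startGatewayServer is invented for illustration:

```python
from collections import defaultdict

# Each function is a node; each call is a directed edge caller -> callee.
calls = [
    ("startGatewayServer", "createConfigIO"),
    ("startGatewayServer", "loadRoutes"),
    ("cliMain", "createConfigIO"),
    ("createConfigIO", "readEnvFile"),
    ("createConfigIO", "parseYaml"),
    ("createConfigIO", "validateSchema"),
]

fan_out = defaultdict(int)  # edges leaving a node: the calls a function makes
fan_in = defaultdict(int)   # edges entering a node: the places that call it
for caller, callee in calls:
    fan_out[caller] += 1
    fan_in[callee] += 1

# fan-out is visible inside one file; fan-in only exists once the whole
# repository has been folded into the edge list.
print(fan_out["createConfigIO"], fan_in["createConfigIO"])  # 3 2
```

The asymmetry the article describes falls directly out of this model: `fan_out` can be computed from a single function body, while `fan_in` requires every edge in the repository to have been ingested first.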
Suddenly, the abstract concept of "technical debt" became a tangible, queryable structure. Questions that were previously impossible to answer without weeks of manual archaeology could now be resolved with a single Cypher query. This graph-based approach doesn't just look at code; it reveals the hidden nervous system of the application.
The first and most powerful insight came from defining a function's "risk value." The formula is devastatingly simple and effective: Risk Value = fan_in (times called) × fan_out (times it calls others).
This metric brilliantly captures the "blast radius." A function with high fan-out is complex, but if it's only called from one place, the risk is contained. A utility function with high fan-in is critical, but if it's simple (low fan-out), it's less likely to break. The true danger lies in the functions where both are high—the complex, system-critical hubs.
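The metric itself is a one-liner over the same caller → callee edge list. A minimal sketch with toy data (note that the formula is consistent with the article's own figures: 18 × 56 = 1008 for createConfigIO):

```python
from collections import defaultdict

def risk_values(calls):
    """Risk Value = fan_in * fan_out for every function in a call-edge list."""
    fan_in, fan_out = defaultdict(int), defaultdict(int)
    for caller, callee in calls:
        fan_out[caller] += 1
        fan_in[callee] += 1
    funcs = set(fan_in) | set(fan_out)
    return {f: fan_in[f] * fan_out[f] for f in funcs}

# Toy edges: "hub" is called from two places and calls three others,
# so it dominates the ranking even though no single number is large.
calls = [("a", "hub"), ("b", "hub"), ("hub", "x"), ("hub", "y"), ("hub", "z")]
riskiest = max(risk_values(calls).items(), key=lambda kv: kv[1])
print(riskiest)  # ('hub', 6)
```

Pure leaf utilities and pure orchestrators both score zero here, which is exactly the point: only functions that are simultaneously depended-upon and complex float to the top.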
The analysis of OpenClaw immediately flagged startGatewayServer (Risk Value: 1030) and createConfigIO (Risk Value: 1008) as extreme hotspots. The createConfigIO function, for example, is called from 18 different places and, in turn, calls 56 other functions. This is precisely the kind of function that is rarely covered by unit tests but can bring down the entire system with a one-line change. An LLM agent, tasked with modifying a configuration setting, would see the 56 outbound calls (fan-out) but be completely oblivious to the 18 inbound dependencies (fan-in). It would operate without understanding the consequences, a recipe for catastrophic failure in an enterprise environment.
The Invisible Tax of Zombie Code and Architectural Knots
The graph revealed more than just function-level risk. A query for all functions with a fan-in of zero—meaning they are never called by any other part of the codebase—yielded a shocking result. After filtering for legitimate entry points, over 2,000 functions, or roughly 20% of the non-entry-point code, were identified as "zombie code."
This isn't just benign clutter. Zombie code imposes a significant cognitive tax on development teams. A new engineer (or an AI agent) stumbles upon a function like assertPublicHostname, assumes it's important, and wastes hours trying to understand its purpose or, even worse, builds new functionality on top of it, only to discover later that it was deprecated years ago. This is a silent productivity killer, a structural debt that accumulates interest in the form of wasted engineering hours.
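The zombie-code query reduces to "fan-in of zero, minus known entry points." A sketch over an invented call list (only assertPublicHostname is taken from the article):

```python
from collections import defaultdict

def find_zombies(calls, entry_points):
    """Functions never called anywhere, excluding legitimate entry points."""
    fan_in = defaultdict(int)
    all_funcs = set()
    for caller, callee in calls:
        fan_in[callee] += 1
        all_funcs.update((caller, callee))
    return {f for f in all_funcs if fan_in[f] == 0 and f not in entry_points}

calls = [
    ("main", "serve"),
    ("serve", "parse"),
    ("assertPublicHostname", "parse"),  # calls others, but nothing calls it
]
print(find_zombies(calls, entry_points={"main"}))  # {'assertPublicHostname'}
```

The entry-point filter is the subtle part in practice: CLI commands, HTTP handlers, and test fixtures all have a fan-in of zero by design, so they must be whitelisted before anything can honestly be declared dead.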
Furthermore, by analyzing dependencies at the module level, the graph exposed a critical architectural bottleneck. A staggering 42% of recent bugs in OpenClaw originated in a single module: src/agents. A query of the graph showed why: over 30 other modules had a tight, bi-directional dependency on src/agents. It had become a god object at the module level, a centralizing hub through which most of the system's business logic was forced to pass. The high bug rate wasn't a coincidence; it was a direct consequence of a flawed architecture made visible by the graph.
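Detecting that kind of knot is mechanical once function edges are lifted to the module level: a bi-directional dependency is simply an edge whose reverse also exists. A sketch with hypothetical module edges (the article doesn't publish OpenClaw's real module graph):

```python
def bidirectional_modules(module_edges):
    """Return module pairs that depend on each other in both directions."""
    edges = set(module_edges)
    return {tuple(sorted(pair)) for pair in edges if (pair[1], pair[0]) in edges}

# Hypothetical module-level edges, aggregated from function calls.
deps = [
    ("src/agents", "src/config"),
    ("src/config", "src/agents"),   # cycle: the tight coupling to flag
    ("src/http", "src/agents"),     # one-way dependency: fine
]
print(bidirectional_modules(deps))  # {('src/agents', 'src/config')}
```

A module that appears in many such pairs is the "god object at the module level" the analysis describes: changes anywhere ripple through it, and bugs concentrate there.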
Epsilla: The Semantic Graph for Enterprise-Ready AaaS
This CodeGraph experiment is a powerful proof of concept, but it stops short of what enterprises truly need. A codebase is not an island. It exists within a rich context of documentation, API specifications, Jira tickets, Slack conversations, and commit histories. To make an Agent-as-a-Service (AaaS) platform truly safe and effective, it needs access to this entire universe of knowledge, structured in a way it can understand and query.
This is precisely why we are building Epsilla. We are moving beyond simple code graphs to create a comprehensive, enterprise-wide Semantic Graph.
In the Epsilla paradigm, the graph doesn't just contain function nodes and call edges. It connects a function node to the Jira ticket that prompted its creation, the design document that specifies its behavior, the API endpoint that exposes it, and the Slack channel where its performance is debated.
Now, imagine an advanced AaaS platform integrated with Epsilla. Before refactoring the high-risk createConfigIO function, the agent doesn't just blindly edit code. It first queries the Semantic Graph via a Model Context Protocol (MCP):
```
QUERY: GET fan_in, fan_out FOR function('createConfigIO')
QUERY: GET downstream_dependencies, business_unit_owner FOR function('createConfigIO')
QUERY: FIND related_tickets WHERE status='OPEN' AND priority='CRITICAL'
QUERY: GET test_coverage_percentage
```
The agent receives a structured response: Fan-in is 18, fan-out is 56. It's owned by the Core Infrastructure team. It's a dependency for the Payments and Analytics services. There are two open critical tickets related to its performance. Test coverage is only 15%.
Armed with this holistic, structural awareness, the agent can now make an intelligent, founder-level decision. It can flag the change as high-risk, notify the owning team, demand an increase in test coverage before proceeding, and link its work directly to the relevant tickets. It transitions from a naive code-writer to a responsible, context-aware engineering partner. This is the difference between a clever tool and an enterprise-ready system. The Semantic Graph provides the judgment layer that raw LLM intelligence lacks.
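Turned into policy, that pre-flight check is a small gate over the graph's answers. The thresholds, field names, and the `preflight` helper below are all invented for illustration, not an Epsilla API:

```python
def preflight(meta, min_coverage=0.6, max_risk=500):
    """Decide whether an agent may proceed with an automated refactor."""
    risk = meta["fan_in"] * meta["fan_out"]
    blockers = []
    if risk > max_risk:
        blockers.append(f"risk {risk} exceeds {max_risk}; notify {meta['owner']}")
    if meta["test_coverage"] < min_coverage:
        blockers.append("insufficient test coverage: add tests before editing")
    if meta["open_critical_tickets"]:
        blockers.append("open critical tickets must be linked and reviewed")
    return ("proceed", []) if not blockers else ("escalate", blockers)

# The createConfigIO figures from the article's example response.
meta = {"fan_in": 18, "fan_out": 56, "owner": "Core Infrastructure",
        "test_coverage": 0.15, "open_critical_tickets": 2}
status, reasons = preflight(meta)
print(status, len(reasons))  # escalate 3
```

Feeding in the article's createConfigIO numbers trips all three blockers, which is the desired behavior: the agent escalates to the owning team instead of editing a 15%-covered, 1008-risk hub on its own.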
The future of software development isn't just about generating code faster. It's about building and maintaining complex systems more intelligently. The structural debt laid bare in the OpenClaw analysis is present in every legacy codebase. By mapping this debt and making it queryable, we give both our human developers and our AI agents the situational awareness they need to navigate it safely. The next leap in productivity will come not from a more powerful LLM, but from a more comprehensive map of the systems they are meant to improve.
FAQ: Agentic Code Analysis and Semantic Graphs
Why can't LLMs just read the whole codebase to understand fan-in?
Even with massive context windows, LLMs process information sequentially, not structurally. A full scan is computationally expensive and doesn't create a queryable graph of relationships. A Semantic Graph pre-processes the code into a structure optimized for instantly answering complex dependency and impact analysis questions.
What's the difference between a simple code graph and Epsilla's Semantic Graph?
A code graph maps functions and calls—what the code is. Epsilla's Semantic Graph is far richer, connecting code to its business context: documentation (why it exists), tickets (its history and issues), and APIs (how it's used). This provides holistic understanding for safer, smarter automation.
How does this make Agent-as-a-Service (AaaS) platforms safer for enterprise use?
It provides a critical safety and validation layer. Before executing a code change, an agent can query the Semantic Graph to understand the full "blast radius" across the system and business. This prevents agents from making locally correct but globally catastrophic changes to critical production infrastructure.