Chapter 10 · Grafting Intelligence onto Legacy: Integration Strategies

Most organisations will not get to start fresh. The question is not how to build an agentic enterprise from scratch, but how to introduce intelligence into one that already exists.

The Legacy Reality

When practitioners talk about deploying AI agents, the conversation tends to assume a clean environment: modern APIs, structured data, cloud-native infrastructure. The actual environment most enterprise agents encounter is different — a decades-long accumulation of systems that were designed without agents in mind, that predate modern API conventions, and that contain the organisation's most valuable operational data precisely because they have been in use for so long.

Legacy systems are not a transitional problem to be solved before the real work begins. For most organisations, integrating agents with legacy infrastructure is the real work. Understanding the patterns available, and the trade-offs each carries, is foundational to any serious agentic deployment programme.

What Makes Legacy Systems Resistant to Integration

Not all legacy systems are equally difficult to integrate with. Bisbal et al. offer a useful working definition: a legacy system is not defined by its age or technology stack, but by how strongly it resists change — which is why some relatively modern systems can be just as hard to integrate with as systems built decades ago.²

Key takeaway: A system doesn't have to be old to behave like a legacy system — what makes it "legacy" is how hard it is to change, not when it was built.

The resistance patterns fall into four categories:

Interface absence: the system was not designed for programmatic access. It has a user interface and perhaps a proprietary protocol, but no modern API. Interaction requires screen-scraping, RPA, or waiting for the vendor to provide an integration layer.

Data model incompatibility: the system's data model was designed for a different era and does not map cleanly onto the structured inputs agents expect. Typical symptoms include dates stored as text, identifiers that have changed meaning over time, and fields that are used differently across business units.

Authentication architectures: many legacy systems use authentication models (session-based, IP-restricted, single sign-on configurations) that do not accommodate agent access patterns, where the "user" is a process rather than a person.

Operational fragility: legacy systems are often running at the boundary of their designed capacity. Agentic workloads can drive dramatically different usage patterns: high-frequency API calls, parallel requests, off-hours activity. Systems that were stable under human interaction patterns can become unstable under agent-driven load.

The Anti-Corruption Layer Principle

Before choosing a technical integration pattern, teams need one architectural principle in place: the agent should not be exposed directly to the legacy system's internal vocabulary, inconsistent states, and historical compromises. Domain-driven design calls this protective boundary an anti-corruption layer — a translation layer that prevents the assumptions of one system from leaking into another.⁵

For agentic systems, the anti-corruption layer is especially important because language models will try to infer meaning from whatever labels and values they receive. A field named status, a dropdown value labelled closed, or an exception message written for an internal operator can all be misinterpreted if they are passed directly into the agent context. The integration layer should translate legacy structures into explicit, documented, agent-safe concepts before the model sees them.

In practice, this means:

converting ambiguous field names into clear semantic labels;
normalising values before retrieval or tool execution;
hiding internal implementation details that the agent does not need;
exposing narrow tools with business-level names rather than raw database or screen-level operations;
documenting the assumptions the translation layer makes.

Key takeaway: Do not let the agent learn the legacy system's confusion. Put a translation boundary in place so the agent interacts with a stable business interface, not decades of accumulated implementation history.
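The translation steps listed above can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the legacy column names (ORDNO, STAT, CUSTNO), the status codes, and the tool name get_order_summary are hypothetical, not taken from any real system.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Hypothetical mapping from terse legacy codes to explicit, documented states.
STATUS_LABELS = {
    "O": "open",
    "C": "closed_resolved",
    "X": "closed_cancelled",
}


@dataclass(frozen=True)
class OrderView:
    """Agent-facing record: clear names, normalised values, no internals."""
    order_id: str
    lifecycle_state: str
    customer_reference: Optional[str]


def translate_order(legacy_row: dict) -> OrderView:
    """Translate one raw legacy row into the agent-safe view."""
    code = (legacy_row.get("STAT") or "").strip().upper()
    if code not in STATUS_LABELS:
        # Fail loudly instead of letting the model guess what a code means.
        raise ValueError(f"unmapped legacy status code: {code!r}")
    customer = (legacy_row.get("CUSTNO") or "").strip() or None
    return OrderView(
        order_id=str(legacy_row["ORDNO"]).strip(),
        lifecycle_state=STATUS_LABELS[code],
        customer_reference=customer,
    )


def get_order_summary(legacy_row: dict) -> dict:
    """Narrow, business-level tool: the only shape the agent ever sees."""
    return asdict(translate_order(legacy_row))
```

Any legacy column that is not explicitly mapped never reaches the agent, and an unknown status code raises an error rather than being passed through for the model to interpret.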

Integration Patterns

Five integration patterns cover the majority of enterprise legacy scenarios. They are not mutually exclusive — most real deployments combine elements of several.

1. API Wrapping

The most direct approach: build a modern REST or GraphQL API layer that abstracts the legacy system's underlying interface and exposes it in a form that agents can consume.

Best for: Systems with a stable, queryable underlying data store — even if the interface is poor — and sufficient engineering access to build the wrapper.

Limitations: The wrapper becomes a maintenance liability. Every change to the legacy system may require a corresponding wrapper update. For highly volatile legacy systems, this creates ongoing overhead. Bisbal et al. go further, noting that widely deployed wrapping solutions can actively compound the maintenance problem rather than simply adding to it, because the wrapper layer must itself be maintained while the underlying system continues to age.²

Key takeaway: Wrapping a legacy system doesn't freeze the problem — it adds a second layer that needs maintaining on top of a first layer that still needs maintaining.
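As a rough illustration of the wrapping idea, the sketch below hides a fixed-width transaction interface behind one typed method. The transaction code INQ01, the 17-character reply layout, and the class names are all hypothetical. The HTTP layer (REST or GraphQL) is deliberately left out so the example stays self-contained.

```python
from typing import Protocol


class LegacyClient(Protocol):
    """Whatever low-level access the legacy system actually offers."""
    def exec_txn(self, code: str, payload: str) -> str: ...


class FakeLegacyClient:
    """Stand-in for the real system so the sketch runs on its own."""
    def exec_txn(self, code: str, payload: str) -> str:
        # Assumed reply layout: 10-char SKU, 6-digit quantity, 1-char flag.
        return f"{payload:<10}{42:06d}A"


class InventoryApi:
    """Wrapper that exposes one clean operation over the legacy interface."""

    def __init__(self, client: LegacyClient) -> None:
        self._client = client

    def get_stock_level(self, sku: str) -> dict:
        if not sku or len(sku) > 10:
            raise ValueError("sku must be 1-10 characters")
        raw = self._client.exec_txn("INQ01", sku.ljust(10))
        if len(raw) != 17:
            # A layout change in the legacy system surfaces here, not in the agent.
            raise RuntimeError(f"unexpected legacy reply length: {len(raw)}")
        return {
            "sku": raw[0:10].strip(),
            "quantity_on_hand": int(raw[10:16]),
            "is_active": raw[16] == "A",
        }
```

The length check is where the maintenance cost described above becomes visible: when the legacy reply format changes, the wrapper has to change with it.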

2. RPA Bridge

Robotic Process Automation tools can drive a legacy user interface the same way a human would — clicking buttons, filling forms, reading screen content. An AI agent can orchestrate an RPA tool to interact with systems that have no programmable interface at all. Van der Aalst et al. describe this as working "outside-in": RPA interacts with a system from the outside without touching what is underneath — which is why it can be deployed quickly, but also why the fragility never fully goes away.³

Best for: Systems with no API, where replacement is not feasible and the interaction patterns are predictable enough to be scripted reliably.

Limitations: RPA is fragile to UI changes. A redesign of the legacy interface can break the automation. It is also slow relative to direct API access, and difficult to parallelise. Van der Aalst et al. also flag a subtler risk: an RPA process that continues to run after a contextual change may produce incorrect results that go undetected for some time — a concern that becomes more acute when an AI agent, rather than a human operator, is responsible for monitoring the output.³ Treat it as a tactical bridge, not a long-term architecture.

Key takeaway: The danger with RPA isn't just that it breaks when something changes — it's that it can keep running silently while producing wrong results.
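One way to reduce the silent-failure risk is to verify each scripted step against what the screen shows afterwards. The sketch below assumes the RPA tool can be reached through two plain callables, one that performs the step and one that reads a field back. The screen dictionary and function names are hypothetical stand-ins, not the API of any RPA product.

```python
from typing import Callable


class RpaVerificationError(RuntimeError):
    """Raised when the screen does not show what the step should have produced."""


def run_verified_step(
    perform: Callable[[], None],
    read_back: Callable[[], str],
    expected: str,
) -> str:
    """Run one scripted UI step, then confirm its effect independently."""
    perform()
    observed = read_back().strip()
    if observed != expected:
        raise RpaVerificationError(
            f"expected {expected!r} but the screen shows {observed!r}"
        )
    return observed


# Stand-in for a legacy screen so the sketch runs without an RPA tool.
screen = {"order_status": "PENDING"}


def click_approve() -> None:
    screen["order_status"] = "APPROVED"


def read_status_field() -> str:
    return screen["order_status"]
```

Calling run_verified_step(click_approve, read_status_field, "APPROVED") succeeds here. If a UI change caused the click to land somewhere else, the read-back would not match and the orchestrating agent would receive an error instead of a false success.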

3. Event-Driven Integration

Rather than having agents poll legacy systems for changes, instrument the legacy system to emit events when relevant state changes occur. The agent subscribes to these events and responds accordingly.

Best for: Use cases where the agent needs to react to changes in a legacy system rather than query it on demand — order status changes, inventory alerts, approval workflow triggers.

Limitations: Requires instrumentation of the legacy system, which may not always be possible. Change Data Capture (CDC), a technique that records every change made to a database and exposes those changes as a stream of events, helps but adds infrastructure complexity.
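The subscribe-and-react flow can be sketched in-process, without a message broker. The ChangeEvent shape below loosely resembles what CDC tools emit (table, operation, before and after images), but the exact fields, the ChangeRouter class, and the handler are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ChangeEvent:
    table: str
    operation: str  # "insert", "update" or "delete"
    key: str
    before: dict = field(default_factory=dict)
    after: dict = field(default_factory=dict)


Handler = Callable[[ChangeEvent], None]


class ChangeRouter:
    """Delivers each change event to the handlers subscribed to its table."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, table: str, handler: Handler) -> None:
        self._handlers.setdefault(table, []).append(handler)

    def publish(self, event: ChangeEvent) -> int:
        handlers = self._handlers.get(event.table, [])
        for handler in handlers:
            handler(event)
        return len(handlers)  # number of handlers that received the event


agent_inbox: List[str] = []


def on_order_change(event: ChangeEvent) -> None:
    """Queue work for the agent only when the order status actually changed."""
    old, new = event.before.get("status"), event.after.get("status")
    if event.operation == "update" and old != new:
        agent_inbox.append(f"order {event.key}: {old} -> {new}")
```

Filtering inside the handler keeps the agent from being woken by every write to the table. It only sees the state transitions it was asked to react to.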

4. Data Replication

Extract data from legacy systems into a modern data store — a data lake, a vector database, a search index — and have agents query the modern store rather than the legacy system directly.

Best for: Read-heavy use cases where agents need to search, retrieve, and reason over legacy data, but do not need to write back to it. Knowledge management, document retrieval, historical analysis.

Limitations: Creates a synchronisation problem. The replicated data is always somewhat behind the source of truth. For use cases where recency matters, this gap is a reliability risk.
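A simple guard against the recency risk is to attach a sync timestamp to every replicated record and refuse to serve it once it is older than the task can tolerate. The sketch assumes each record carries a timezone-aware synced_at field. That field name, the function, and the error type are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


class StaleReplicaError(RuntimeError):
    """Raised when replicated data is older than the task can tolerate."""


def read_if_fresh(
    record: dict,
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> dict:
    """Return the record with its age attached, or refuse if it is too old."""
    now = now or datetime.now(timezone.utc)
    age = now - record["synced_at"]
    if age > max_age:
        raise StaleReplicaError(f"replica is {age} old; limit is {max_age}")
    # Expose the age so the agent can qualify its answer if needed.
    return {**record, "age_seconds": int(age.total_seconds())}
```

The acceptable max_age is a per-use-case decision: historical analysis may tolerate a day, while a stock-level lookup may tolerate only minutes.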

5. Strangler Fig Modernisation

The strangler fig pattern, named by Martin Fowler, replaces a legacy system incrementally. New functionality is built as modern, agent-compatible services, and traffic is progressively routed to those services until the legacy system can be retired, or "strangled".
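The progressive routing can be pictured as a facade that sends each capability either to a new service or to the legacy system. This is a minimal in-process sketch under assumed names (StranglerFacade, legacy_system, new_quote_service). A real deployment would more likely do this at a gateway or proxy.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


class StranglerFacade:
    """Single entry point that routes each capability to new or legacy code."""

    def __init__(self, legacy: Callable[[str, dict], dict]) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, capability: str, handler: Handler) -> None:
        """Mark a capability as served by a new service from now on."""
        self._migrated[capability] = handler

    def handle(self, capability: str, request: dict) -> dict:
        handler = self._migrated.get(capability)
        if handler is not None:
            return handler(request)
        # Anything not yet migrated still goes to the legacy system.
        return self._legacy(capability, request)


def legacy_system(capability: str, request: dict) -> dict:
    return {"served_by": "legacy", "capability": capability}


def new_quote_service(request: dict) -> dict:
    return {"served_by": "quote-service", "capability": "quote"}
```

Callers, including agents, only ever talk to the facade, so each call to migrate moves one slice of traffic without changing any caller.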

Fowler cautions that technical modernisation without accompanying changes to organisational culture and development practices tends to reproduce the same structural problems in the new system — making the strangler fig as much an organisational intervention as an architectural one.¹ The underlying mechanism is Conway's Law: organisational structures tend to produce systems that mirror how those organisations communicate, so a team that hasn't changed how it works will tend to build the same patterns into whatever replaces the legacy system.¹

Key takeaway: Modernising the technology without modernising how the team works tends to recreate the same problems in the new system.

Best for: Organisations with a long-term commitment to modernising a specific system, sufficient engineering capacity for parallel maintenance during transition, and use cases that can be decomposed cleanly into new and legacy function areas.

Limitations: Requires sustained engineering investment over an extended period. The "never finished" modernisation project is a common failure mode of this pattern.

Data Harmonisation

Regardless of integration pattern, agents connecting to legacy systems face a data quality and consistency challenge that is often more significant than the interface challenge.

Common data harmonisation problems in enterprise legacy integration:

Problem	Example	Mitigation
Format inconsistency	Dates as "01/05/24" vs "2024-05-01"	Normalisation layer before agent consumption
Entity mismatch	"Customer ID" means different things in different systems	Entity resolution layer; canonical identifiers
Semantic drift	A field called "status" has different values in different systems	Schema documentation; value mapping tables
Missing data	Fields that should be populated are frequently null	Explicit null handling in agent prompts and tools
Duplicate records	Same entity represented multiple times	Deduplication pipeline upstream of agent
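Several of the mitigations in the table can be sketched as small functions placed upstream of the agent. The sketch makes assumptions that would need checking against real data: that "01/05/24" is day/month/year, that the two source systems are called billing and crm, and that records already carry a canonical_id from an entity resolution step.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional

# Assumed input formats: ISO dates and day/month/two-digit-year dates.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%y")

# Hypothetical value mapping table, one entry per source system.
STATUS_VALUES: Dict[str, Dict[str, str]] = {
    "billing": {"A": "active", "I": "inactive"},
    "crm": {"1": "active", "0": "inactive"},
}


def normalise_date(value: Optional[str]) -> Optional[str]:
    """Return an ISO 8601 date string, or None for missing values."""
    if value is None or not value.strip():
        return None  # explicit null instead of an empty or guessed date
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {value!r}")


def normalise_status(system: str, raw: Optional[str]) -> Optional[str]:
    """Map a system-specific status code onto the shared vocabulary."""
    if raw is None:
        return None
    return STATUS_VALUES[system][raw.strip()]


def deduplicate(records: Iterable[dict]) -> List[dict]:
    """Keep the first record seen for each canonical identifier."""
    seen = set()
    unique = []
    for record in records:
        if record["canonical_id"] not in seen:
            seen.add(record["canonical_id"])
            unique.append(record)
    return unique
```

Unknown formats and unmapped codes raise errors here rather than flowing through, which keeps data problems attributable to the data pipeline rather than to the agent.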

Investing in data harmonisation before building agent capability is almost always the right sequence. An agent built on inconsistent data will produce inconsistent outputs — and the resulting errors will be attributed to the agent rather than the data. Wang and Strong's research on data quality is a useful reminder that accuracy is only one of four things that matter: data also needs to be relevant to the specific task, consistently formatted, and accessible — which means harmonisation work should be shaped by what the agent will actually do, not by a general idea of "clean data".⁴ Their framework also treats timeliness as a distinct contextual quality dimension: data that is technically accurate but stale fails the contextual test for any task where recency matters, which is precisely the risk the data replication pattern introduces.⁴

Key takeaway: "Clean data" has four dimensions — intrinsic accuracy, contextual fit (including timeliness), representational clarity, and accessibility — and an agent can fail on any of them even when the data looks correct in the database.

Read, Recommend, Write: A Safer Integration Ladder

Agentic integration becomes substantially riskier when a system moves from reading legacy data to changing legacy state. A useful rollout sequence is therefore a three-step permission ladder:

Stage	Agent Capability	Appropriate Use	Promotion Criterion
Read	Query and summarise legacy information	Knowledge retrieval, diagnostics, reporting	Retrieval accuracy and source traceability are validated
Recommend	Propose an action but require human approval	Refund suggestions, ticket routing, invoice exception handling	Human reviewers consistently approve and trust the recommendation logic
Write	Execute changes directly through controlled tools	Low-risk updates, reversible actions, mature workflows	Rollback, audit, and exception handling are proven in production-like tests

This ladder prevents a common failure pattern: giving an agent write access because the read-only prototype looked impressive. Reading a record correctly and changing that record safely are different levels of operational maturity.
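The ladder can be enforced in code rather than by convention. The sketch below is one possible shape, assuming every state-changing tool call passes through a single gate. Stage, ToolGate, and the approved_by parameter are hypothetical names.

```python
from enum import IntEnum
from typing import Callable, Optional


class Stage(IntEnum):
    READ = 1
    RECOMMEND = 2
    WRITE = 3


class ToolGate:
    """Checks the granted stage before any change reaches the legacy system."""

    def __init__(self, stage: Stage) -> None:
        self.stage = stage

    def submit(
        self,
        action: str,
        apply: Callable[[], None],
        approved_by: Optional[str] = None,
    ) -> dict:
        if self.stage < Stage.RECOMMEND:
            raise PermissionError("read-only stage: the agent may not propose changes")
        if self.stage == Stage.RECOMMEND and approved_by is None:
            # Nothing is applied until a human signs off.
            return {"action": action, "status": "pending_approval"}
        apply()
        return {"action": action, "status": "executed", "approved_by": approved_by}
```

Promoting an agent then becomes an explicit configuration change (a different Stage value) that can be reviewed and audited, rather than a side effect of adding a new tool.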

Managing Risk During Integration

Legacy systems often underpin critical business processes. Introducing agentic access creates new risk vectors that must be managed explicitly.

Read-before-write. In early integration phases, configure agents with read-only access to legacy systems. Validate that the agent's understanding of the data is accurate before allowing it to write or trigger changes.

Shadow mode testing. Run the agent in parallel with existing processes, letting it compute the actions it would take without committing them, and compare agent outputs against human outputs. This surfaces discrepancies before they reach production.
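A shadow run reduces to computing the agent's decision for each case, never applying it, and diffing it against what the human actually did. The sketch below assumes decisions can be compared as plain strings. The refund rule is a toy stand-in for real agent logic.

```python
from typing import Callable, Dict, List


def shadow_compare(
    cases: List[dict],
    agent_decide: Callable[[dict], str],
    human_decisions: Dict[str, str],
) -> dict:
    """Compare agent proposals with recorded human decisions, case by case."""
    discrepancies = []
    for case in cases:
        proposed = agent_decide(case)  # computed only, never committed
        actual = human_decisions[case["id"]]
        if proposed != actual:
            discrepancies.append(
                {"id": case["id"], "agent": proposed, "human": actual}
            )
    agreed = len(cases) - len(discrepancies)
    return {
        "agreement_rate": agreed / len(cases) if cases else None,
        "discrepancies": discrepancies,
    }


def refund_agent(case: dict) -> str:
    """Toy stand-in for the agent's decision logic."""
    return "refund" if case["amount"] <= 50 else "escalate"
```

The discrepancy list is usually more informative than the agreement rate, since each entry is a concrete case that a reviewer can inspect.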

Rollback capability. Design integration points with the assumption that you will need to roll back agent-initiated changes. This requires thinking about transaction boundaries and audit logging from the outset.
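One way to design for rollback is to record a before-and-after image for every agent write, so that each change can be reversed individually. The in-memory store below is a sketch of that idea under assumed names. A real integration would rely on the legacy system's own transactions or on compensating operations.

```python
from typing import Any, Dict, List


class AuditedStore:
    """Key-value store that logs every agent write and can undo it."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = dict(data)
        self.audit_log: List[dict] = []

    def get(self, key: str) -> Any:
        return self._data[key]

    def update(self, key: str, value: Any, actor: str) -> int:
        """Apply a change to an existing key and return its audit entry id."""
        entry = {
            "id": len(self.audit_log),
            "actor": actor,
            "key": key,
            "before": self._data[key],  # KeyError for unknown keys: updates only
            "after": value,
            "rolled_back": False,
        }
        self._data[key] = value
        self.audit_log.append(entry)
        return entry["id"]

    def rollback(self, entry_id: int) -> None:
        """Restore the previous value, unless something changed in between."""
        entry = self.audit_log[entry_id]
        if entry["rolled_back"]:
            raise ValueError("entry already rolled back")
        if self._data[entry["key"]] != entry["after"]:
            raise RuntimeError("record changed since the agent's write; review manually")
        self._data[entry["key"]] = entry["before"]
        entry["rolled_back"] = True
```

The check inside rollback matters: blindly restoring the old value could overwrite a later, legitimate change made by a person or another process.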

Load testing under agent patterns. Legacy systems were sized for human usage patterns. Test them under the load profiles that agent-driven access will create — before going live.
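A first approximation of an agent-style load profile is a burst of parallel calls with latency and error counts recorded. The sketch uses only the standard library and a stub in place of the legacy system. The function names and report fields are hypothetical, and a real test would target a non-production copy of the system.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_burst(call: Callable[[int], None], requests: int, concurrency: int) -> dict:
    """Fire `requests` calls with up to `concurrency` in flight at once."""
    if requests < 1 or concurrency < 1:
        raise ValueError("requests and concurrency must be positive")

    def timed(i: int):
        start = time.perf_counter()
        try:
            call(i)
            ok = True
        except Exception:
            ok = False  # count failures instead of aborting the run
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed, range(requests)))

    latencies = [seconds for _, seconds in results]
    return {
        "requests": requests,
        "errors": sum(1 for ok, _ in results if not ok),
        "median_seconds": statistics.median(latencies),
        "max_seconds": max(latencies),
    }


def stub_legacy_call(i: int) -> None:
    """Stand-in for one request against the legacy system."""
    time.sleep(0.001)
```

Running the same burst at increasing concurrency levels, and watching where errors or maximum latency start to climb, gives a rough indication of how much parallel agent traffic the system can absorb.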

The integration patterns in this chapter describe how agents can reach legacy systems without destabilising them. The next chapter goes one level deeper into the engineering details that determine whether those integrations actually survive production: credentials, tool protocols, error propagation, observability, latency, and security boundaries.

References

1. Fowler, M. (2024). Strangler Fig. martinfowler.com. https://martinfowler.com/bliki/StranglerFigApplication.html
2. Bisbal, J., Lawless, D., Wu, B., & Grimson, J. (1999). Legacy Information Systems: Issues and Directions. IEEE Software, 16(5), 103–111. https://doi.org/10.1109/52.795108
3. van der Aalst, W.M.P., Bichler, M., & Heinzl, A. (2018). Robotic Process Automation. Business & Information Systems Engineering, 60(4), 269–272. https://doi.org/10.1007/s12599-018-0542-4
4. Wang, R.Y. & Strong, D.M. (1996). Beyond Accuracy: What Data Quality Means to Data Consumers. Journal of Management Information Systems, 12(4), 5–33.
5. Evans, E. (2003). Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley.
