
Chapter 8 · The Build-vs-Buy Decision in an Agentic World

A framework for making the decision once — and revisiting it when you should.


Why This Decision Is Harder Than It Looks

The build-vs-buy question has always been present in enterprise software. What makes it particularly difficult in agentic AI is the combination of four factors that rarely appear together: rapid capability improvement, fragmented tooling, high switching costs, and genuine uncertainty about what "best" even means for your specific use case.

A vendor solution that covers 80% of your needs today might cover 95% in six months — or might be discontinued. An internal build that perfectly fits your requirements might be rendered obsolete by a model update that changes the economics of the entire approach. You are making a capital allocation decision in a market that is moving fast enough to invalidate your assumptions before the decision is fully implemented.

Gartner's 2025 Hype Cycle places AI agents at the Peak of Inflated Expectations, explicitly flagging that rapidly changing model and tooling options are making it difficult for organisations to define a stable roadmap. That structural condition sits directly beneath every build-vs-buy decision in this space [2].

The market is early: a 2025 McKinsey survey of nearly 2,000 executives found that while 62% of organisations are experimenting with AI agents, only 23% are scaling them anywhere in the enterprise, and in no individual business function are more than 10% of organisations at the scaling stage [3].

Key takeaway: Most organisations are still in the experimentation phase with agentic AI, which means any build-vs-buy decision is being made before the market has reached a stable, evaluable state.

This does not mean the decision is arbitrary. It means the decision needs to be structured around durable principles rather than current feature lists.

This chapter closes Part 2 by converting the architectural choices from the previous three chapters into a capital allocation decision. Chapter 5 asked whether the system should be one agent or many. Chapter 6 mapped the platform layers that constrain the build. Chapter 7 showed how background, persistent agents raise the operational stakes. Build-vs-buy is where those choices become budget, ownership, and risk.


The Decision Matrix

The build-vs-buy decision in agentic AI is better understood as four distinct sub-decisions, each of which can be made independently.

Competitive differentiation is the decisive question. If the agentic capability you are building is core to your product or competitive position — a proprietary customer experience, a unique analytical capability, a workflow that encodes genuine institutional knowledge — building it is not just justifiable, it is strategically necessary. Giving a vendor visibility into that workflow or depending on them for its reliability is a strategic risk.

If the capability is operational rather than strategic — expense report automation, IT ticket triage, meeting note summarisation — the build case weakens considerably.


The True Cost of Building

Teams systematically underestimate the cost of building and maintaining agentic systems. The underestimation is not in the initial development — it is in the ongoing operational overhead.

| Cost Category | What Teams Often Miss |
| --- | --- |
| Prompt engineering | Ongoing maintenance as model versions update |
| Evaluation infrastructure | Building and running evals to catch regressions |
| Observability tooling | Logging, tracing, and alerting for non-deterministic systems |
| Safety and guardrails | Developing and maintaining content filters, scope controls |
| Model management | Managing multiple model versions, A/B testing, rollbacks |
| Incident response | Debugging failures in systems that are hard to reproduce |

The gap between a proof-of-concept agentic system and a production-grade one is typically 3–5x the development effort of the initial build. A common failure pattern is to plan for the POC and then discover the production gap mid-deployment.

The question is not "how much does it cost to build?" but "how much does it cost to maintain, improve, and operate over three years?"
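
To make the three-year framing concrete, here is a minimal sketch of the arithmetic. The names (`BuildEstimate`, `three_year_build_cost`, `three_year_buy_cost`) and every figure are hypothetical placeholders rather than benchmarks. The sketch assumes the production gap is additional effort on top of the POC, and that annual upkeep can be approximated as a fixed fraction of the initial build cost.

```python
from dataclasses import dataclass


@dataclass
class BuildEstimate:
    poc_cost: float             # cost of the initial proof-of-concept
    production_gap_mult: float  # extra effort to reach production, as a multiple of the POC
    annual_run_fraction: float  # assumed yearly upkeep as a fraction of the initial build cost


def three_year_build_cost(est: BuildEstimate, years: int = 3) -> float:
    """Initial build (POC plus production gap) plus recurring upkeep."""
    initial = est.poc_cost * (1 + est.production_gap_mult)
    return initial + initial * est.annual_run_fraction * years


def three_year_buy_cost(annual_fee: float, integration_cost: float, years: int = 3) -> float:
    """One-off integration plus recurring vendor fees."""
    return integration_cost + annual_fee * years


# Hypothetical inputs, chosen only to show the shape of the comparison.
build = three_year_build_cost(BuildEstimate(100_000, 4.0, 0.25))
buy = three_year_buy_cost(annual_fee=200_000, integration_cost=50_000)
print(f"build: {build:,.0f}  buy: {buy:,.0f}")
```

With these made-up inputs the build comes to 875,000 and the buy to 650,000 over three years. The point is not the result but that the recurring terms, which POC budgets tend to omit, account for a large share of both totals.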


The Hidden Risks of Buying

The case for buying is equally subject to underestimation — but of risk rather than cost.

Vendor lock-in at the capability layer is the most significant risk. When a vendor's agent handles your customer interactions, the institutional knowledge about how those interactions should work lives in their configuration, not yours. If you need to switch vendors (because pricing changes, because the vendor is acquired, because a competitor offers better capability), you may find that the accumulated tuning of your agent is not portable. This risk is borne out in practice: a 2025 survey of 100 enterprise CIOs found that agentic workflows have measurably increased model switching costs, with leaders reporting that prompt tuning for multi-step agent tasks is so workflow-specific that migrating to a different model can consume significant engineering time [1].

Data exposure is the second major risk. Agentic systems often need access to sensitive data to be useful. A vendor-hosted agent that processes HR data, financial records, or customer information requires careful contractual and technical controls that are harder to enforce in practice than on paper.

Dependency on a single model's behaviour is underappreciated. Vendors that host models update them — and model updates change agent behaviour in ways that are difficult to predict or test for before the change is live. A model that answers your customers' questions one way today may answer them differently in three months.
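
One way to keep this risk visible is to own a small golden set and rerun it whenever the vendor changes the underlying model. The sketch below is illustrative only: `GOLDEN_SET`, `regression_failures`, and the two stand-in agents are hypothetical, and substring matching is a deliberately crude check that a real evaluation suite would replace.

```python
from typing import Callable

# Hypothetical golden set: each prompt is paired with a phrase the answer must contain.
GOLDEN_SET = [
    ("What is the refund window?", "30 days"),
    ("Can I return an opened item?", "store credit"),
]


def regression_failures(agent: Callable[[str], str],
                        golden_set: list[tuple[str, str]]) -> list[str]:
    """Return the prompts whose answers no longer contain the expected phrase."""
    return [
        prompt
        for prompt, expected in golden_set
        if expected.lower() not in agent(prompt).lower()
    ]


# Stand-ins for the same vendor agent before and after a model update.
def agent_before(prompt: str) -> str:
    answers = {
        "What is the refund window?": "Refunds are accepted within 30 days.",
        "Can I return an opened item?": "Yes, for store credit.",
    }
    return answers[prompt]


def agent_after(prompt: str) -> str:
    answers = {
        "What is the refund window?": "Refunds are accepted within one month.",
        "Can I return an opened item?": "Yes, for store credit.",
    }
    return answers[prompt]


print(regression_failures(agent_before, GOLDEN_SET))  # []
print(regression_failures(agent_after, GOLDEN_SET))   # ['What is the refund window?']
```

The second answer may be equally correct for a customer, which is exactly why the definition of a passing answer has to be yours rather than the vendor's.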


Hybrid Architectures

For most organisations, the answer is not a binary choice but a layered architecture that buys commodity capability and builds differentiated capability.

In this model, you buy access to foundation models and orchestration infrastructure (the commodity layer), while building the components that encode your specific business logic: the tools your agents use, the data they access, the guardrails that define acceptable behaviour, and the evaluation systems that tell you when something has gone wrong.

This hybrid approach balances speed (vendor commodity capability is available now) with control (your differentiated layer is yours to own and improve).
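
The layering described above can be sketched in a few lines. This is an assumption-laden illustration, not a reference design: `ModelProvider`, `EchoVendor`, `BLOCKED_TOPICS`, and `guarded_answer` are hypothetical names, and the keyword check stands in for whatever guardrail logic an organisation actually owns.

```python
from typing import Protocol


class ModelProvider(Protocol):
    """Commodity layer: any purchased model or orchestration backend."""

    def complete(self, prompt: str) -> str: ...


class EchoVendor:
    """Stand-in for a vendor SDK, present only so the sketch runs."""

    def complete(self, prompt: str) -> str:
        return f"vendor answer to: {prompt}"


# Hypothetical internally owned policy: topics the agent must not handle.
BLOCKED_TOPICS = ("salary", "medical")


def guarded_answer(provider: ModelProvider, prompt: str) -> str:
    """Owned layer: apply internal guardrails before calling the bought layer."""
    if any(topic in prompt.lower() for topic in BLOCKED_TOPICS):
        return "ESCALATE: request outside approved scope"
    return provider.complete(prompt)


print(guarded_answer(EchoVendor(), "Reset my password"))
print(guarded_answer(EchoVendor(), "What is Bob's salary?"))
```

Because the guardrail depends only on the `ModelProvider` interface, swapping the vendor means writing a new adapter class while the owned policy stays untouched.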

The Two Orchestration Tiers: A Critical Distinction

The orchestration framework layer — the "Buy" box in the diagram above — is not a single homogeneous market. It has stratified into two tiers with materially different build-vs-buy implications, and treating them as equivalent is a common source of poor platform decisions.

Tier 1: Code-first developer frameworks (LangGraph, Microsoft Agent Framework, CrewAI, OpenAI Agents SDK) require sustained software engineering investment to operate, but deliver the depth, state management, observability, and control that production-grade agent systems demand. For this tier, "buying" the framework means accepting its abstraction model and its lock-in. Teams should evaluate whether the orchestration logic they are encoding is portable — whether the workflow rather than the framework implementation is the durable asset.

Tier 2: Visual and low-code platforms (n8n, Dify) serve a fundamentally different profile: technically capable teams who need to orchestrate agents across business processes without maintaining a full-stack AI engineering function. For this tier, "buying" the platform means trading depth for speed and organisational reach — non-engineers can participate in building and maintaining workflows, and deployment cycles are measured in days rather than sprints. The ceiling is lower, and the lock-in is different in character: workflow definitions tend to be more portable than compiled code, but vendor-specific node configurations create their own migration friction.

The decision question is therefore not just "build or buy orchestration" but "which tier of the orchestration market fits our engineering profile, and what are we building on top of it?" A team choosing a low-code platform for operational automation while maintaining a code-first framework for strategically differentiated workflows is not contradicting itself; it is making a rational tier assignment. Both tiers are covered in detail in Chapter 6.


What Not to Outsource

Even in a buy-heavy strategy, some capabilities should remain internally owned because they encode accountability rather than commodity functionality.

| Capability | Why It Should Stay Internal |
| --- | --- |
| Evaluation criteria | Vendors can supply tooling, but only the organisation can define what “good” means in its own risk context. |
| Domain policy | Business rules, compliance tolerances, and escalation thresholds are expressions of organisational judgement. |
| Data access decisions | Third-party tools can enforce permissions, but ownership of who should access what must remain internal. |
| Incident response | When an agent fails, the accountable organisation cannot delegate explanation, remediation, or customer impact management. |
| Strategic workflow knowledge | The workflows that create competitive advantage should not become trapped inside a vendor configuration layer. |

This does not mean building every component yourself. It means separating commodity infrastructure from institutional judgement. Buy the parts that accelerate execution; own the parts that define responsibility.

A Framework for the Decision

Market data supports the buy direction for most operational use cases: a 2025 Menlo Ventures survey of 495 enterprise decision-makers found that 76% of AI use cases are now purchased rather than built internally, up from 53% the prior year, a shift driven by the maturing application ecosystem rather than reduced confidence in internal teams [4]. Enterprise budget data complicates this picture, however: IT-function leaders report allocating roughly 30% of their Gen AI technology budgets to internal R&D, a pattern researchers interpret as evidence that firms are simultaneously building proprietary capabilities rather than relying entirely on vendor solutions [5].

Use this framework as a structured starting point, not a definitive formula:

| Dimension | Build Indicators | Buy Indicators |
| --- | --- | --- |
| Strategic value | Core to competitive position | Operational, non-differentiating |
| Data sensitivity | Highly sensitive, regulatory constraints | Standard, low-risk data |
| Customisation depth | Deep domain specificity required | General-purpose use case |
| Engineering capacity | Strong ML/AI team in place | Limited AI engineering capacity; consider Tier 2 low-code platforms (n8n, Dify) as a middle path before full vendor commitment |
| Speed requirements | Time allows for proper build | Need capability now |
| Vendor market maturity | Market immature or fragmented | Strong vendor options exist |
| Switching cost tolerance | Low tolerance for dependency | Acceptable to depend on vendor |
| Orchestration tier fit | Complex, stateful workflows requiring code-first control | Business process automation suited to visual/low-code tier |

Revisit this decision at least annually. The agentic infrastructure market is evolving fast enough that the right answer in 2024 may be the wrong answer in 2026 — in either direction.
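
As a starting point, the eight dimensions of the framework can be recorded and tallied, which also leaves a dated artefact to compare against at the next review. The sketch below is a hypothetical illustration: the dimension keys, the `tally` function, and the sample assessment are assumptions, and the count is deliberately unweighted even though the chapter treats strategic value as the decisive question.

```python
DIMENSIONS = (
    "strategic_value", "data_sensitivity", "customisation_depth",
    "engineering_capacity", "speed_requirements", "vendor_market_maturity",
    "switching_cost_tolerance", "orchestration_tier_fit",
)


def tally(assessment: dict[str, str]) -> dict[str, int]:
    """Count how many framework dimensions point to 'build' and how many to 'buy'."""
    counts = {"build": 0, "buy": 0}
    for dim in DIMENSIONS:
        verdict = assessment.get(dim)
        if verdict not in counts:
            raise ValueError(f"{dim}: expected 'build' or 'buy', got {verdict!r}")
        counts[verdict] += 1
    return counts


# Hypothetical assessment of an internal IT service desk agent.
it_service_desk = {dim: "buy" for dim in DIMENSIONS}
it_service_desk["data_sensitivity"] = "build"  # e.g. identity data under regulatory constraints
print(tally(it_service_desk))  # {'build': 1, 'buy': 7}
```

A lopsided tally like this one suggests a default, not a verdict; a single build indicator on a decisive dimension can still outweigh the rest.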


What This Looks Like in Practice: Decision Cases

The examples below are decision cases rather than case studies. Their purpose is to show how the build-vs-buy logic changes depending on differentiation, data sensitivity, engineering capacity, and governance exposure.

| Scenario | Better default | Why | What to keep under internal control |
| --- | --- | --- | --- |
| HR policy Q&A agent | Buy or configure | Common workflow, mature vendor options, low differentiation if restricted to approved policy retrieval | Current policy corpus, jurisdiction rules, escalation logic, employee-data permissions |
| Internal IT service desk agent | Buy or low-code first | High-volume routine workflows with standard integrations and clear escalation paths | Identity permissions, approval thresholds, incident routing, audit logs |
| Proprietary underwriting support agent | Build or hybrid | Encodes institutional risk appetite, domain judgement, and proprietary data relationships | Scoring logic, data connectors, evaluation suite, approval workflow, audit trail |
| Customer support agent | Hybrid | Vendor platforms accelerate channel integration, but domain knowledge and risk boundaries are organisation-specific | Knowledge base, escalation rules, tone, regulated-claim controls, customer-data access |
| Regulated compliance workflow | Build or tightly governed hybrid | Auditability, explainability, and accountability matter more than speed | Evidence chain, decision rights, validation logic, retention policy, regulator-facing records |

The decision pattern is consistent: buy where the workflow is common and non-differentiating; build where the workflow encodes competitive advantage, institutional judgement, or regulatory accountability; use hybrid architectures where the commodity layer can be purchased but the risk-bearing logic must remain yours.

| Field | Design question |
| --- | --- |
| Problem | Is this a commodity process, a differentiated capability, or a regulated decision path? |
| Setup | Which layers can be purchased without losing control over the business logic? |
| Agentic element | Does the system need adaptive tool use and escalation, or would a fixed workflow be safer? |
| Tools/data | Which integrations expose sensitive, proprietary, or regulated data? |
| Human oversight | Where must approval remain internal regardless of vendor capability? |
| Main risk | Buying speed at the wrong layer can outsource the very judgement the organisation needs to own. |

When to Revisit the Decision

Three triggers should prompt a reassessment:

  1. A vendor releases a capability that matches 90%+ of your build: the economics of maintaining a build shift significantly.
  2. Your custom build becomes a maintenance burden: if more than 20% of your AI engineering time is spent maintaining existing agentic systems rather than building new ones, the cost calculus has changed.
  3. A major model or platform shift changes the foundation: transitions like the GPT-3 to GPT-4 generation, the arrival of open-weight frontier models, or governance shifts such as MCP and A2A moving to Linux Foundation stewardship have been moments where many build decisions should have been reconsidered.
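
The three triggers above can be written down as an explicit check so that a reassessment is prompted by data rather than by memory. This is a minimal sketch: `ReviewInputs` and `revisit_reasons` are hypothetical names, the two thresholds are taken from the list, and how an organisation measures vendor coverage or maintenance share is left as an assumption.

```python
from dataclasses import dataclass


@dataclass
class ReviewInputs:
    vendor_coverage: float    # share of your build's capability a vendor now matches (0 to 1)
    maintenance_share: float  # share of AI engineering time spent maintaining existing agents (0 to 1)
    foundation_shift: bool    # a major model, platform, or governance shift since the last review


def revisit_reasons(inputs: ReviewInputs) -> list[str]:
    """Return the triggers that currently apply; an empty list means no reassessment is due."""
    reasons = []
    if inputs.vendor_coverage >= 0.90:
        reasons.append("vendor capability matches 90%+ of the build")
    if inputs.maintenance_share > 0.20:
        reasons.append("maintenance exceeds 20% of AI engineering time")
    if inputs.foundation_shift:
        reasons.append("major model or platform shift")
    return reasons


# Hypothetical quarterly check.
print(revisit_reasons(ReviewInputs(vendor_coverage=0.92, maintenance_share=0.10, foundation_shift=False)))
```

Any non-empty result is a prompt to rerun the framework, not an instruction to switch.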


Part 2 has moved from internal architecture to external infrastructure, ambient deployment, and sourcing strategy. The next part can now move from “how do we build the agentic stack?” to “what happens when that stack meets people, legacy systems, security boundaries, and the real world?” The build-vs-buy decision is therefore not the end of strategy; it is the point at which strategy becomes operational exposure.

References

  1. Andreessen Horowitz (2025). How 100 Enterprise CIOs Are Building and Buying Gen AI in 2025. Andreessen Horowitz. June 2025.
  2. Gartner (2025). Hype Cycle for Artificial Intelligence, 2025 (ID: G00828523). Gartner, Inc. June 2025.
  3. McKinsey & Company (2025). The State of AI in 2025: Agents, Innovation, and Transformation. QuantumBlack, AI by McKinsey. November 2025.
  4. Menlo Ventures (2025). 2025: The State of Generative AI in the Enterprise. Menlo Ventures. December 2025.
  5. Wharton Human-AI Research & GBK Collective (2025). Accountable Acceleration: Gen AI Fast-Tracks Into the Enterprise. Wharton Human-AI Research & GBK Collective, University of Pennsylvania. October 2025.
