Part 4 — Failure, Security, and Information Integrity
Overview
Capability and risk are not separable. Every property that makes an agent useful — autonomous action, broad tool access, persistent operation, and the ability to communicate fluently — also creates a surface for failure, exploitation, and harm.
Part 4 examines the risk landscape of agentic AI across three levels. First, it looks at intrinsic failure: the perception gaps, hallucinations, planning failures, calibration problems, and context limitations that arise from how agents actually work. Second, it examines adversarial exploitation: the ways attackers can turn those weaknesses into prompt injection, tool abuse, data exfiltration, and cross-agent compromise. Third, it considers systemic harm: what happens when agents generate, retrieve, summarise, and amplify information at machine speed inside already fragile information environments.
These risks are not speculative. Many have already materialised in early deployments, and all become more consequential as agent autonomy increases. The chapters in this part are not cautionary tales meant to discourage deployment. They are an honest account of the failure modes practitioners need to understand before they can design around them. An organisation whose teams have worked through these chapters is better positioned to deploy agents safely than one whose teams have not. Deployment does not become easier, but the risks become legible.
The arc of this part is deliberate: it moves from unintentional failure, through adversarial exploitation, to systemic impact. Chapter 12 examines the intrinsic fallibility of agents: where perception gaps, misclassification, hallucination, planning errors, and context drift originate, and how they manifest in production. Chapter 13 maps the attack surfaces that agentic architectures introduce, with particular attention to prompt injection, tool abuse, data exfiltration, and cross-agent trust exploitation. Chapter 14 examines the disinformation dimension: what happens when agents capable of generating and distributing content at machine speed are deployed without adequate controls on what they produce, retrieve, summarise, and amplify.
Chapters in This Part
| Chapter | Title | Theme |
|---|---|---|
| 12 | The Fallibility of Agents: Perception Gaps and Failure Modes | Intrinsic limitations |
| 13 | Attack Surfaces in Agentic Systems: A Security Primer | Adversarial risk |
| 14 | Disinformation at Machine Speed: How Agents Can Mislead | Information integrity |
Chapter 12 establishes the foundational risk context that Chapters 13 and 14 build on. Reading this part in order is recommended.