Chapter 25 · Ethics in Practice: Responsible AI Beyond Compliance

Compliance tells you the minimum. Ethics asks what you actually want to be.

The Gap Between Compliance and Values

An organisation that meets every applicable AI regulation has not necessarily built ethical AI. It has built compliant AI, which is a different thing. Compliance is a floor, not a ceiling, and that floor often sits below the design standards an organisation would set for itself if it cares about its reputation and about its relationships with the people its AI affects.

This chapter is not about what the law requires. Chapter 24 covers that. This chapter is about what organisations that want to build AI worthy of the trust placed in it actually do differently — and specifically, how those choices show up in design decisions rather than mission statements.

Every section in this chapter is intended to land on a concrete design decision or operational practice. Ethics that does not change what you build is decoration. This practical emphasis matters because global AI ethics guidelines have converged around recurring principles — fairness, transparency, accountability, privacy, and human control — while diverging sharply on how those principles should actually be implemented.²

Fairness in Agentic Systems

AI fairness has been discussed for over a decade, primarily in the context of static machine learning models making classification decisions: does the model perform differently for different demographic groups? Barocas, Hardt, and Narayanan's treatment of fairness is especially important here because it shows that fairness is not a single metric but a family of trade-offs among error distribution, representation, measurement, and institutional context.³ The agentic context is more complex, and the fairness implications are harder to detect and correct.

An agent trained or fine-tuned on historical data inherits the biases in that data, which is why dataset documentation practices such as Datasheets for Datasets matter: they force teams to record how data was collected, what populations it represents, what it excludes, and what uses it was not designed to support.⁴ But unlike a static classifier, an agent acts on those biases repeatedly, across a wide range of tasks, over an extended period, at high volume. The cumulative effect of a small bias in how an agent handles one category of input, compounded across thousands of interactions, can produce outcomes that are materially unfair even when no individual agent decision would appear unreasonable in isolation.

Three specific fairness risks in agentic systems deserve attention:

Input handling bias. Agents may handle inputs from some groups differently based on stylistic or linguistic features that correlate with demographic characteristics — formal versus informal language, names, vocabulary choices. An agent that provides better-quality responses to queries written in formal professional English than to queries written in other registers is applying a de facto preference that may produce systematically worse outcomes for some users.

Escalation bias. Agents configured to escalate complex or uncertain cases to human review may escalate different case types at different rates for different user groups. If the escalation decision is made partly on the basis of conversational features that correlate with demographic characteristics, the users who are escalated to human review (and receive better service) may not be a demographically neutral sample.

Compounding over time. In personalised systems, agent behaviour adapts over time to patterns in user interaction. If the initial behaviour is biased, the adaptation compounds the bias: users who receive worse service interact less, the agent learns that their interaction pattern is associated with less valuable engagement, and the service differential widens.

Testing for fairness in agentic systems requires more than demographic disaggregation of performance metrics. It requires testing with inputs that systematically vary the features that might trigger differential treatment — across the full range of tasks the agent handles, not just the task types that were tested pre-deployment. The Amazon recruiting-tool episode illustrates why aggregate testing matters: the reported bias was not an obviously discriminatory rule, but a statistical pattern learned from historical hiring data and surfaced only when outcomes were examined systematically.⁷
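
A paired-input test is one way to put this into practice: keep the underlying request fixed, vary only a surface feature such as register, and compare outcomes across the variants. The sketch below is illustrative rather than a reference implementation. The names `toy_agent`, `success_rates`, and `paired_outcome_gap`, the variant labels, and the 0.1 tolerance are all assumptions made for the example, and the toy agent is deliberately biased so that the test has something to find.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# One paired case = the same underlying request written in several registers.
PairedCase = Dict[str, str]  # variant label -> input text


def toy_agent(text: str) -> bool:
    """Hypothetical stand-in for a real agent call.

    Returns True when the request is resolved. Deliberately biased:
    it only resolves requests that contain the word 'please'.
    """
    return "please" in text.lower()


def success_rates(agent: Callable[[str], bool],
                  cases: List[PairedCase]) -> Dict[str, float]:
    """Success rate per variant label across all paired cases."""
    wins: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for case in cases:
        for label, text in case.items():
            totals[label] += 1
            wins[label] += int(agent(text))
    return {label: wins[label] / totals[label] for label in totals}


def paired_outcome_gap(rates: Dict[str, float]) -> float:
    """Largest difference in success rate between any two variants."""
    return max(rates.values()) - min(rates.values())


cases = [
    {"formal": "Could you please reset my password?",
     "informal": "cant log in, need a pw reset"},
    {"formal": "Please update my delivery address.",
     "informal": "moved house, change my address"},
]

rates = success_rates(toy_agent, cases)
gap = paired_outcome_gap(rates)
TOLERANCE = 0.1  # assumed threshold; the real value is a policy decision
flagged = gap > TOLERANCE
```

With these inputs the formal variants all succeed and the informal ones all fail, so the gap is flagged. In a real evaluation the stand-in would be replaced by calls to the deployed agent, and the paired cases would span every task type the agent handles, not only those tested before deployment.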

Accountability Architecture

Chapter 24 addresses the legal dimensions of accountability — the liability frameworks, regulatory obligations, and compliance structures that the law requires. This section addresses the complementary organisational question: what does it mean for accountability to be genuine rather than nominal, and how does that show up in how the organisation is structured around its agents? Internal algorithmic auditing provides the operational bridge between these two meanings of accountability: it turns abstract responsibility into documented review practices, named owners, testable claims, and remediation loops.⁵

Genuine accountability has three properties that legal frameworks alone do not guarantee:

Identifiability. For every consequential decision an agent makes or contributes to, there is a named human who is accountable for it. Not a team, not a function, not "the system" — a named individual whose accountability is documented and whose performance evaluation reflects how well they exercise that accountability. Accountability without identifiability is diffuse enough to be meaningless.

Proportionate consequence. When something goes wrong, the accountability structure produces a proportionate response — not punishment, but a real connection between a person's accountability and their incentive to exercise genuine judgment. An accountability structure in which all agent errors are attributed to "the technology" produces no incentive for the humans overseeing agents to catch problems before they propagate.

Meaningful scope. The humans who are accountable for agent decisions must actually be able to understand, evaluate, and influence those decisions. Accountability without scope — assigning responsibility to people who lack the information or authority to affect the outcome — is not accountability. It is a liability transfer.

The practical design implication is that accountability structures for agentic AI should be built backward from these properties: start with the consequential decisions the agent makes, identify who has the information and authority to evaluate each one, and assign accountability to those people — rather than assigning accountability to whoever is organisationally convenient.
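
Working backward from the decisions can be sketched as a small accountability map with a check against two of the three properties. This is a minimal illustration under stated assumptions: `Owner`, `DecisionType`, `accountability_gaps`, and `GENERIC_NAMES` are hypothetical names, the owners are placeholders, and proportionate consequence is left out because it lives in performance and incentive processes and cannot be checked from a data structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Owner:
    name: str                # a named individual, not a team or function
    can_see_reasoning: bool  # has the information to evaluate the decision
    can_override: bool       # has the authority to change the outcome


@dataclass(frozen=True)
class DecisionType:
    label: str
    owner: Optional[Owner]


# Assumed examples of non-answers to the question "who is accountable?"
GENERIC_NAMES = {"the system", "the team", "operations"}


def accountability_gaps(decisions: List[DecisionType]) -> List[str]:
    """Return one finding per decision type that fails a property."""
    findings = []
    for d in decisions:
        if d.owner is None or d.owner.name.strip().lower() in GENERIC_NAMES:
            findings.append(f"{d.label}: no named individual (identifiability)")
            continue
        if not (d.owner.can_see_reasoning and d.owner.can_override):
            findings.append(
                f"{d.label}: owner lacks information or authority (meaningful scope)"
            )
    return findings


decisions = [
    DecisionType("refund above limit", Owner("A. Example", True, True)),
    DecisionType("account suspension", Owner("the system", True, True)),
    DecisionType("credit line change", Owner("B. Example", True, False)),
]
findings = accountability_gaps(decisions)
```

The second and third decision types are reported: one has no named individual, and the other names someone who cannot change the outcome, which is the liability-transfer pattern described above.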

Transparency as Design

The regulatory conception of transparency is largely about disclosure: users must be told they are interacting with AI, high-risk systems must document their functioning, providers must maintain technical documentation. These are necessary requirements. They are not sufficient for the transparency that genuinely ethical AI requires.

Meaningful transparency goes beyond disclosing the existence of AI. It makes the AI's reasoning visible to the people who need to understand it — including, critically, the people whose interests the AI affects, not just the people who operate it.

Reconstructable reasoning. An agent whose outputs cannot be explained in terms of the inputs and reasoning that produced them is a system where accountability is structurally impossible. Interpretability research is relevant not because every agent decision can be made perfectly transparent, but because it clarifies the standard: explanations must support reliable human judgement about whether the system behaved appropriately, not simply satisfy curiosity.⁶ Designing for reconstructable reasoning means maintaining intermediate reasoning traces, using confidence scoring that reflects genuine calibration rather than apparent confidence, and ensuring that the audit trail described in Chapter 22 can support a meaningful explanation, not just a record that something happened.

Scope visibility. The people who interact with an agent should know, clearly and consistently, what the agent can and cannot do. This is not just a trust calibration issue (covered in Chapter 9 for customer-facing AI) — it is an ethical requirement. An agent that appears to offer general guidance but actually has a limited scope creates the conditions for reliance errors that harm the people who trust it.

Limitation acknowledgment. An agent that does not know something should say so. An agent that is operating near the edge of its reliable capability should signal that. The design of agent communication should treat confident-sounding outputs from unreliable states as a design defect, not as an acceptable trade-off for a smoother user experience.
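
The calibration requirement under reconstructable reasoning, and the related point about confident-sounding outputs, can be checked empirically. The sketch below computes a binned expected calibration error, a common calibration measure: outputs are grouped by stated confidence, and each group's average confidence is compared with how often it was actually correct. The record format, the bin count, and the sample data are assumptions made for illustration.

```python
from typing import List, Tuple

# One record = (confidence the agent stated, whether the output was correct)
Record = Tuple[float, bool]


def expected_calibration_error(records: List[Record], n_bins: int = 5) -> float:
    """Weighted average of |mean confidence - accuracy| over confidence bins."""
    bins: List[List[Record]] = [[] for _ in range(n_bins)]
    for conf, correct in records:
        # A confidence of exactly 1.0 belongs in the top bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))

    total = len(records)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Invented data: the first agent says 0.9 but is right one time in four.
overconfident = [(0.9, True), (0.9, False), (0.9, False), (0.9, False)]
# The second says 0.75 and is right three times in four.
calibrated = [(0.75, True), (0.75, True), (0.75, True), (0.75, False)]

ece_overconfident = expected_calibration_error(overconfident)
ece_calibrated = expected_calibration_error(calibrated)
```

The overconfident sample scores about 0.65 and the calibrated one scores 0. A confidence score only supports limitation acknowledgment if a check of this kind shows that it tracks actual reliability.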

Consent and the Autonomous Action Problem

When a human being acts on behalf of another (an employee acting for their employer, a lawyer acting for a client, a doctor acting for a patient), the scope of that authority is defined, either explicitly by contract or implicitly by professional norms and legal frameworks. The person acting on behalf of another is constrained by the authority they have been given.

When an agent acts on behalf of a user, the scope of that authority is defined by the agent's configuration — a document that most users have never read and which was not negotiated with them. The agent's operator decided what the agent can do; the user agreed, in the most general terms, to use a service. The gap between what the user believed they were agreeing to and what the agent is authorised to do is frequently significant.

This gap is an ethical problem, not just a legal one. Nissenbaum's contextual integrity framework explains why: information and authority are not free-floating simply because they are technically available; they remain governed by the norms of the context in which they were given.⁸ The legal requirements for consent — a click-through agreement, a terms of service disclosure — can be met without anything resembling genuine informed consent to the specific actions the agent might take. Ethical practice requires more:

Scope disclosure at decision points. Users should be told, at the moment when an agent is about to take a significant action, what action it is about to take and what authority it is exercising. "I'm about to submit this expense claim on your behalf — does this look correct?" is a minimal but genuine consent mechanism. It is also not the default for most agentic deployments.

Action reversibility by default. Where possible, agents should take reversible actions before irreversible ones, and should flag when an irreversible action is required before taking it. The gap between authorised and desired action is most damaging when the action cannot be undone.

Genuine opt-out. Users must be able to withdraw agent authority without the withdrawal being practically difficult, commercially disadvantageous, or technically complex. An opt-out that requires a call to customer service is not genuinely available; an opt-out that sits one step away from the point where the authority was granted is.

Key takeaway: Consent for agentic AI is not achieved by burying authority in terms of service. It requires, at minimum, that users understand what the agent is about to do before it does it, and can genuinely stop it from doing so.
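
The three practices above can be combined into a single consent gate. The following is a minimal sketch, not a production design: `Action`, `AuthorityGrant`, `run_with_consent`, and the scripted confirmation callback are hypothetical names, every action is treated as significant for simplicity, and the sketch only logs what would happen instead of performing any real action.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Action:
    name: str
    description: str  # plain-language text shown to the user
    reversible: bool


@dataclass
class AuthorityGrant:
    allowed: Set[str] = field(default_factory=set)

    def revoke(self, action_name: str) -> None:
        """One-step opt-out: withdrawing authority needs nothing else."""
        self.allowed.discard(action_name)


def run_with_consent(actions: List[Action], grant: AuthorityGrant,
                     confirm: Callable[[str], bool]) -> List[str]:
    log = []
    # Reversible actions go first (False sorts before True; sort is stable).
    for action in sorted(actions, key=lambda a: not a.reversible):
        if action.name not in grant.allowed:
            log.append(f"skipped {action.name}: no authority")
            continue
        prompt = f"I'm about to {action.description}."
        if not action.reversible:
            prompt += " This cannot be undone."
        if confirm(prompt + " Go ahead?"):
            log.append(f"done {action.name}")  # a real agent would act here
        else:
            log.append(f"declined {action.name}")
    return log


grant = AuthorityGrant({"save_draft", "submit_claim", "close_account"})
grant.revoke("close_account")

actions = [
    Action("submit_claim", "submit this expense claim on your behalf", False),
    Action("save_draft", "save a draft of the claim", True),
    Action("close_account", "close your account", False),
]

prompts: List[str] = []


def scripted_confirm(prompt: str) -> bool:
    """Hypothetical user who approves only the draft."""
    prompts.append(prompt)
    return "draft" in prompt


log = run_with_consent(actions, grant, scripted_confirm)
```

In this run the draft is saved first, the irreversible submission is declined after an explicit warning, and the revoked action is never offered. The ordering and the warning text are design choices; the point is that disclosure, reversibility, and withdrawal are enforced in the control flow and not left to the terms of service.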

The Ethical Treatment of the People Agents Affect

Every agentic deployment displaces some human activity. Sometimes that displacement is trivial: the agent handles administrative tasks that nobody wanted to do. Sometimes it is significant: the agent handles work that constituted the majority of a team's role. In both cases, the organisation deploying the agent is making a choice that affects real people, and that choice has ethical dimensions that a business case does not capture. The taxonomy of language-model risks likewise treats economic and social harms as first-order risks, not side effects outside the AI system's ethical perimeter.¹

This is not an argument against deployment. It is an argument that the ethical dimensions of the workforce effects of agentic AI deserve explicit consideration in deployment planning rather than being treated as a consequence to be managed after the fact.

Organisations that address this well do three things:

Acknowledge the effects honestly. They do not describe workforce automation as "freeing people to do higher-value work" when the honest description is that some roles will no longer be needed. Describing displacement as liberation is both inaccurate and disrespectful to the people affected.

Plan workforce transitions actively. Where agent deployment will reduce headcount, transition planning — retraining, redeployment, severance, and outplacement support — should be planned before deployment, not after. The people who will be affected deserve that consideration.

Involve affected workers in deployment design. Workers whose roles will change significantly because of agent deployment have knowledge that is directly relevant to deployment quality — they know the edge cases, the exception handling, the informal processes that documentation does not capture. Their involvement in deployment design is both ethically appropriate and practically valuable.

Practical Ethics Governance

Ethics governance in most organisations exists in one of two forms: as a policy document that nobody reads, or as a central committee that reviews AI projects and produces recommendations that are rarely binding.

Neither form produces the continuous, embedded ethical review that agentic AI requires. The alternative is to build ethics review into the operational processes that govern agentic deployments — making it a regular activity with real authority, not a periodic event with advisory status.

The components of effective ethics governance for agentic AI:

Embedded review at design gates. Each stage transition in the maturity framework from Chapter 16 should include an explicit ethical review: do the agent's scope, data access, and action authority at this level of autonomy match the ethical principles the organisation has committed to? This review should be conducted by someone with both ethical expertise and operational authority, not outsourced to a committee that meets quarterly.

Ongoing monitoring for ethical drift. Ethical issues in agentic systems do not announce themselves. Bias that compounds over time, scope creep that gradually expands the agent's authority, fairness metrics that trend in the wrong direction — these require monitoring on the same cadence as operational metrics. Ethics metrics should sit alongside operational metrics in the agent health reviews described in Chapter 21.

Clear escalation for ethical concerns. The people operating agents should have a clear, safe channel for raising ethical concerns — about agent behaviour they observe, about deployment decisions they disagree with, about outcomes they find troubling. Organisations that do not provide this channel will find that ethical concerns are raised outside the organisation rather than within it.
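
Monitoring for ethical drift on an operational cadence can be as simple as tracking one fairness metric per review period and comparing recent values with an early baseline. The sketch below assumes the metric is a gap between two user groups, for example the difference in escalation rate. The function name `drift_alert`, the window sizes, the 0.02 tolerance, and the sample histories are all invented for illustration.

```python
from typing import List


def drift_alert(gap_history: List[float], baseline_periods: int = 3,
                recent_periods: int = 3, tolerance: float = 0.02) -> bool:
    """True when the recent mean gap exceeds the baseline mean by more
    than the tolerance. Returns False if there is too little history."""
    if len(gap_history) < baseline_periods + recent_periods:
        return False
    baseline = sum(gap_history[:baseline_periods]) / baseline_periods
    recent = sum(gap_history[-recent_periods:]) / recent_periods
    return recent - baseline > tolerance


# Gap between two groups, one value per review period (invented numbers).
stable = [0.030, 0.028, 0.031, 0.029, 0.030, 0.032]
drifting = [0.030, 0.028, 0.031, 0.045, 0.058, 0.070]

stable_alert = drift_alert(stable)
drifting_alert = drift_alert(drifting)
```

Only the second history raises an alert. No single period in it looks dramatic on its own, which is why the comparison is made against a baseline and not against the previous period.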

An Ethics Operating Model

Ethical practice becomes real when it is translated into operating routines. A responsible agentic AI programme should therefore maintain a small set of recurring artefacts:

Artefact	Purpose
Fairness test plan	Defines which groups, inputs, dialects, registers, and edge cases will be tested
Dataset record	Documents provenance, representativeness, exclusions, and intended use
Accountability map	Names the human owner for each consequential decision type
Transparency standard	Specifies what affected users are told, when, and in what form
Consent and authority log	Records which actions the agent is authorised to take and how users can withdraw that authority
Ethics incident register	Captures fairness, consent, transparency, or workforce-impact concerns even when no law was violated

The point is not to create more bureaucracy. It is to ensure that values survive operational pressure. A principle that is not assigned to an artefact, owner, and cadence will disappear when deadlines tighten.
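
Assigning each artefact an owner and a cadence makes lapses detectable. The sketch below is one hedged way to represent that: the artefact names come from the table above, while the `Artefact` class, the `overdue` helper, the owner placeholders, the review intervals, and the dates are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class Artefact:
    name: str
    owner: str          # a named individual (placeholder values below)
    cadence_days: int   # assumed review interval
    last_reviewed: date


def overdue(artefacts: List[Artefact], today: date) -> List[str]:
    """Names of artefacts whose last review is older than their cadence."""
    return [a.name for a in artefacts
            if today - a.last_reviewed > timedelta(days=a.cadence_days)]


register = [
    Artefact("Fairness test plan", "A. Example", 30, date(2025, 6, 1)),
    Artefact("Accountability map", "B. Example", 90, date(2025, 1, 15)),
    Artefact("Ethics incident register", "C. Example", 7, date(2025, 6, 20)),
]

late = overdue(register, today=date(2025, 6, 30))
```

Here the accountability map and the ethics incident register are reported as overdue. A check like this could run alongside the operational health reviews so that a missed ethics review is as visible as a missed service-level target.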

Key takeaway: Responsible AI beyond compliance requires artefacts, owners, and review rhythms. Without them, ethics remains a statement of intent rather than a property of the system.

References

1. Weidinger, L., Uesato, J., Rauh, M., Griffin, C., Huang, P., Mellor, J., Glaese, A., Cheng, M., Balle, B., Kasirzadeh, A., Biles, C., Brown, S., Kenton, Z., Hawkins, W., Stepleton, T., Birhane, A., Hendricks, L.A., Rimell, L., Isaac, W., Haas, J., Legassick, S., Irving, G., & Gabriel, I. (2022). Taxonomy of Risks posed by Language Models. FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, pp. 214–229. ACM.
2. Jobin, A., Ienca, M., & Vayena, E. (2019). The Global Landscape of AI Ethics Guidelines. Nature Machine Intelligence, 1(9), 389–399. https://doi.org/10.1038/s42256-019-0088-2
3. Barocas, S., Hardt, M., & Narayanan, A. (2023). Fairness and Machine Learning: Limitations and Opportunities. MIT Press. Available at fairmlbook.org.
4. Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J.W., Wallach, H., Daumé, H., & Crawford, K. (2021). Datasheets for Datasets. Communications of the ACM, 64(12), 86–92. https://doi.org/10.1145/3458723
5. Raji, I.D., Smart, A., White, R.N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J., Theron, D., & Barnes, P. (2020). Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing. FAccT '20. ACM.
6. Doshi-Velez, F. & Kim, B. (2017). Towards a Rigorous Science of Interpretable Machine Learning. arXiv:1702.08608. Harvard University.
7. Dastin, J. (2018). Amazon scraps secret AI recruiting tool that showed bias against women. Reuters, October 10, 2018.
8. Nissenbaum, H. (2004). Privacy as Contextual Integrity. Washington Law Review, 79(1), 119–157.
