Agent Memory Governance: What Your AI Agent Remembers and Who Controls It

Most governance conversations about AI agents begin with what the agent can do — what tools it can access, what tasks it can execute, what data it can touch. Very few ask what it remembers, who controls that memory, and what happens when that memory is wrong, outdated, or deliberately manipulated.

In most deployments, the memory layer accumulates risk quietly, across sessions, decisions, and time, in an area most departments neither monitor nor own.

This article covers four memory types — episodic, procedural, semantic, and working — the governance risk each introduces, and what organisations in regulated industries should do about it.

Agent memory is primarily discussed as an engineering topic: how do you build a system that retains useful context across sessions? That framing is correct, but it is incomplete. Memory is also a governance boundary — and for risk managers operating in regulated industries, it is one that has received almost no serious attention.

The Control Boundary Stops Too Early

Governance for agentic systems is typically structured around three questions: what can the agent access, what can it execute, and what outputs does it produce. These are necessary controls. But they stop at the point of interaction.

What sits beyond that boundary is the memory layer — what persists across sessions, how that memory is formed, and who can modify it. In many deployments, that layer operates without explicit control design. A stateful agent accumulates. It adapts. Its behaviour at any given moment is partially a function of everything it encountered across prior sessions — and that accumulated state is a persistent attack surface and an audit challenge that most organisations are not equipped to manage.

The result is a gap: a system that is technically secured and functionally correct, but influenced by accumulated context that has never been reviewed. Many agents already have memory, by design or by accident. What most AI risk programs have not yet built are controls that account for that memory — its contents, its integrity, its lifecycle, and who has authority over it.

Memory Has Data Governance Obligations

GDPR Articles 5, 25, and 32 do not disappear because personal data is now being processed by an AI agent rather than a human. Purpose limitation, storage limitation, and data integrity obligations apply to whatever the agent retains about users and interactions.

Article 5(1)(e) requires that personal data not be kept longer than necessary for the purpose. An agent that indefinitely retains episodic memory of user interactions is almost certainly in breach of this principle unless retention has been explicitly governed.

FINMA Guidance 08/2024 requires documented oversight of system behaviour and decision processes. Behaviour driven by accumulated memory that has never been audited is not documented oversight; it is undocumented drift.

Four Memory Types, Four Governance Problems

A practical way to understand this layer is through four distinct memory types. Each introduces a different control challenge.

Episodic memory is the agent's record of past interactions — conversation history, timestamps, what happened and when. In many implementations, this record is stored as vectors and retrieved through the same RAG pipeline used for semantic knowledge — which means the boundary between what the agent "remembers" and what it "knows" is often a single retrieval system, not two separate layers. This record is often partially written and selectively retrieved by the agent itself. If it can be manipulated or filtered, behaviour can be influenced in ways that are invisible to standard monitoring. An attacker who can write to the episodic store can effectively shape the agent's perceived history without touching the model.
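One way to make manipulation of the episodic store visible is to chain each entry to the one before it with a hash. The sketch below is illustrative only: the `EpisodicLog` class, its field names, and the sample records are assumptions made for this example, not part of any particular agent framework.

```python
import hashlib
import json


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


class EpisodicLog:
    """Hypothetical append-only log where each entry commits to its predecessor."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, record: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append({"record": record, "hash": entry_hash(prev, record)})

    def verify(self) -> bool:
        """Recompute the chain; an edited or reordered entry breaks it."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["hash"] != entry_hash(prev, entry["record"]):
                return False
            prev = entry["hash"]
        return True


log = EpisodicLog()
log.append({"session": "s1", "event": "client asked about fees"})
log.append({"session": "s2", "event": "client confirmed address"})
intact = log.verify()

# Simulate an attacker rewriting history in place.
log.entries[0]["record"]["event"] = "client waived fee disclosure"
tampered_detected = not log.verify()
```

A chain like this detects edits and reordering but not removal of the most recent entries; catching that would require storing the latest hash somewhere the attacker cannot write.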

Procedural memory encodes the agent's operating routines — the workflows, standard operating procedures, and decision rules that govern how it executes tasks. In practice, this logic lives in system prompts, orchestration layers, and instruction stores. What makes this layer particularly difficult to govern is that procedural memory can also evolve through use: in more advanced agent architectures, the system derives and stores new operating rules from its own experience — a process called memory consolidation. This means the rules an agent follows may no longer match the rules it was originally given, without anyone having authorised the change. Whether the source of change is an attacker modifying an instruction store or the agent consolidating new procedures from past interactions, the governance requirement is the same: versioning, approval workflows, and the ability to audit what changed and when.
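The versioning and approval requirement can be sketched as a store that never overwrites an instruction and refuses self-approved changes. The class names, role identifiers, and sample rules below are hypothetical; a production system would typically sit on top of an existing change-management tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class InstructionVersion:
    version: int
    text: str
    proposed_by: str
    approved_by: str
    approved_at: str


@dataclass
class InstructionStore:
    """Hypothetical store that keeps every approved version of the instructions."""

    history: list[InstructionVersion] = field(default_factory=list)

    def publish(self, text: str, proposed_by: str, approved_by: str) -> InstructionVersion:
        # Four-eyes rule: the proposer (human or the agent itself) cannot self-approve.
        if not approved_by or approved_by == proposed_by:
            raise PermissionError("change requires approval by a second party")
        version = InstructionVersion(
            version=len(self.history) + 1,
            text=text,
            proposed_by=proposed_by,
            approved_by=approved_by,
            approved_at=datetime.now(timezone.utc).isoformat(),
        )
        self.history.append(version)
        return version

    def current(self) -> InstructionVersion:
        return self.history[-1]


store = InstructionStore()
store.publish("Escalate refunds above 500 to a human.", "eng.alice", "risk.bob")

# A rule the agent consolidated from its own experience goes through the same gate.
try:
    store.publish("Auto-approve refunds for repeat clients.", "agent", "agent")
    self_approved = True
except PermissionError:
    self_approved = False
```

Because the history list is only ever appended to, the question "what changed and when" can be answered by reading it back.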

Semantic memory holds the agent's general and domain-specific knowledge — facts, associations, institutional knowledge — typically managed through retrieval systems and knowledge graphs. This is the layer where corrupting the knowledge base becomes a concrete threat. Inserting incorrect information does not produce an obvious error; it produces subtly wrong outputs that may not surface until the damage is done. In financial services, legal, or healthcare contexts, the consequences of systematically incorrect semantic memory are not theoretical.
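A provenance gate in front of the knowledge base is one possible control here. The following sketch assumes a hypothetical allow-list of vetted sources and a named reviewer per entry. It does not check whether a fact is true, only whether it arrived through an accountable path.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical allow-list of sources the organisation has vetted.
APPROVED_SOURCES = {"policy-handbook", "product-catalogue"}


@dataclass(frozen=True)
class Fact:
    text: str
    source: str
    reviewed_by: Optional[str] = None


def ingest(fact: Fact, knowledge_base: list, quarantine: list) -> bool:
    """Store the fact only if its source is approved and a reviewer is named."""
    if fact.source in APPROVED_SOURCES and fact.reviewed_by:
        knowledge_base.append(fact)
        return True
    quarantine.append(fact)  # held for human review, never retrieved by the agent
    return False


kb: list = []
held: list = []
accepted = ingest(Fact("Wire cut-off is 16:00 CET.", "policy-handbook", "ops.lead"), kb, held)
rejected = ingest(Fact("Wire cut-off is 23:00 CET.", "scraped-forum-post"), kb, held)
```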

Working memory is the agent's active context window — what the model holds and reasons over during a single task. The content of that window is assembled by the orchestration layer, which pulls from episodic, procedural, and semantic memory and composes them into the prompt the model receives. This distinction matters for governance: working memory itself is not a persistent store, but it is where the integrity of every other memory layer is either preserved or lost. If upstream sources are compromised, the orchestrator assembles a corrupted context and passes it to a model that has no way to detect the difference. Prompt injection operates at exactly this point — it exploits the moment of assembly to override trusted instructions with malicious ones, because to the model, both arrive in the same context window and both appear valid.
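Because the context window is assembled rather than stored, one modest control is to record what went into it. The sketch below builds a prompt from labelled segments and emits a manifest of entry identifiers and content hashes. The `Segment` type, the layer names, and the entry IDs are assumptions for illustration. Labelling does not stop prompt injection; it makes the assembled context reviewable afterwards.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    layer: str     # "procedural", "semantic", or "episodic"
    entry_id: str  # identifier of the memory entry in its store
    text: str


def assemble(segments: list) -> tuple:
    """Build the prompt and a manifest recording which entries went into it."""
    prompt_parts = []
    manifest = []
    for seg in segments:
        prompt_parts.append(f"[{seg.layer}:{seg.entry_id}]\n{seg.text}")
        manifest.append({
            "layer": seg.layer,
            "entry_id": seg.entry_id,
            "sha256": hashlib.sha256(seg.text.encode("utf-8")).hexdigest(),
        })
    return "\n\n".join(prompt_parts), manifest


prompt, manifest = assemble([
    Segment("procedural", "sop-12", "Never disclose account balances by email."),
    Segment("episodic", "ep-2041", "Client asked to be contacted by phone only."),
])
```

Stored alongside the model's output, the manifest lets a reviewer trace a decision back to the specific memory entries that shaped it.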

None of these require a model failure. The system behaves as designed — even when operating on context that may be compromised.

How This Fails in Practice

These are not theoretical edge cases. They follow recognisable patterns in organisations that have deployed memory-enabled agents without a corresponding governance layer.

A knowledge base is updated with inaccurate or biased content. Retrieval continues to function correctly, but outputs degrade over time, and because the system is technically operational, no control flags the problem.

A prompt or orchestration rule is modified in a production workflow without formal approval. Behaviour changes, but there is no versioning, no audit trail, and no notification to the risk function.

A malicious input enters the working context and overrides trusted instructions. The system complies, because the context it receives appears valid.

Past interactions are retained and reused without periodic review. Historical bias accumulates and begins to influence future decisions, not through a discrete failure event but through gradual drift that becomes visible only when the consequences are already material.

In each case, the system is operating correctly. The failure is in the context it is operating on.

The Lifecycle Problem RAG Did Not Solve

Retrieval-Augmented Generation became the standard approach for grounding agents in organisational knowledge, and it works well for what it was designed to do: retrieve relevant information at inference time. But RAG is a retrieval mechanism, not a memory management system. It does not handle the memory lifecycle — and that distinction matters considerably for governance.

Effective memory management requires decisions about what to update when new information conflicts with existing knowledge, what to consolidate when redundant information accumulates, and — critically — what to forget when retention is no longer appropriate or permitted. RAG systems, in their standard implementations, do not make those decisions. They retrieve from whatever the vector store contains, regardless of whether that content is current, accurate, or still authorised to be retained.

Selective forgetting is also a security control, not only a compliance obligation. When a memory entry is identified as compromised — through a poisoned knowledge base update, a manipulated interaction record, or an outdated instruction — the ability to surgically remove that entry and verify its removal is a remediation capability. Organisations that treat memory deletion purely as a GDPR retention exercise miss this entirely: an agent operating on a corrupted memory store needs active remediation, not just a scheduled purge cycle.
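As a rough illustration of remove-and-verify, the toy store below deletes a single entry and then checks, through the same search path the agent would use, that the entry can no longer be retrieved. The `MemoryStore` class and its keyword search are stand-ins invented for this sketch; a real vector store would need the equivalent check against its index and any caches or replicas.

```python
class MemoryStore:
    """Toy in-memory store standing in for a real retrieval backend."""

    def __init__(self) -> None:
        self._entries: dict = {}
        self.deletion_log: list = []

    def put(self, entry_id: str, text: str) -> None:
        self._entries[entry_id] = text

    def search(self, term: str) -> list:
        """Return IDs of entries whose text contains the term (case-insensitive)."""
        return [eid for eid, text in self._entries.items() if term.lower() in text.lower()]

    def forget(self, entry_id: str, reason: str) -> bool:
        """Remove one entry, then confirm it is no longer retrievable."""
        removed_text = self._entries.pop(entry_id, None)
        if removed_text is None:
            return False  # nothing to remove
        # Verify through the retrieval path, not just the storage layer.
        verified = entry_id not in self.search(removed_text)
        self.deletion_log.append({"entry_id": entry_id, "reason": reason, "verified": verified})
        return verified


store = MemoryStore()
store.put("kb-1", "Standard fee is 1.2%.")
store.put("kb-2", "Standard fee is 0%, approved by compliance.")  # poisoned entry
ok = store.forget("kb-2", reason="poisoned knowledge base update")
remaining = store.search("standard fee")
```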

Where personal data is retained, GDPR's storage limitation principle requires that it be kept only as long as necessary for its purpose. An agent that indefinitely accumulates interaction history and retrieves it at will — without any governance layer over retention, accuracy, or consent — has accumulated obligations its operators may not have noticed.

Most organisations can describe what their agent is permitted to do. Fewer can describe what it is permitted to remember — and fewer still have defined who holds authority over that memory.
— AI Resilience Lab · May 2026

Authority over memory includes two obligations that most programs have not yet operationalised. The first is scheduled deletion: memory that has served its purpose should be removed on a defined cycle, not retained indefinitely because no one has decided otherwise. The second is ownership: agent memory requires a named risk owner — a role with documented accountability for what the memory contains, how long it is kept, and what happens when it needs to be corrected or purged. Without both, the memory layer operates outside the control environment regardless of how well the rest of the system is governed.
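Both obligations can be expressed in a few lines: a retention policy that carries a named owner, and a purge routine that refuses to run without one. The 180-day period, the role title, and the type names below are hypothetical values chosen for the example, not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetentionPolicy:
    layer: str
    max_age: timedelta
    risk_owner: str  # named role accountable for this memory layer


@dataclass(frozen=True)
class MemoryEntry:
    entry_id: str
    created_at: datetime


def purge_expired(entries: list, policy: RetentionPolicy, now: datetime) -> tuple:
    """Split entries into (kept, purged) according to the policy's maximum age."""
    if not policy.risk_owner:
        raise ValueError(f"no risk owner assigned for {policy.layer} memory")
    cutoff = now - policy.max_age
    kept = [e for e in entries if e.created_at >= cutoff]
    purged = [e for e in entries if e.created_at < cutoff]
    return kept, purged


now = datetime(2026, 5, 1, tzinfo=timezone.utc)
policy = RetentionPolicy("episodic", timedelta(days=180), "Head of Client Operations")
entries = [
    MemoryEntry("ep-1", datetime(2025, 9, 1, tzinfo=timezone.utc)),   # older than 180 days
    MemoryEntry("ep-2", datetime(2026, 3, 15, tzinfo=timezone.utc)),  # within retention
]
kept, purged = purge_expired(entries, policy, now)
```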

The Ownership Gap

One reason this problem persists is structural. Responsibility for the memory layer is typically fragmented across roles that were not designed to coordinate on it: engineering builds the memory mechanisms, data teams manage storage, security controls access, and governance reviews outputs.

No single role owns how memory shapes behaviour over time. That gap matters — because once decisions depend on memory, the question is no longer whether the system is secure. It becomes whether the context influencing this decision is valid, controlled, and reviewed.

Assigning that accountability is an organisational task, not a technical one. It requires a named risk owner — documented in the AI risk register with clear responsibility for memory contents, retention schedules, and remediation — and it requires the AI risk function to extend its scope beyond system boundaries to include the persistent state those systems accumulate.

What Risk Managers Should Be Asking Now

Agent memory is not yet a standard category in AI risk inventories, model risk assessments, or third-party AI vendor reviews. The frameworks that come closest to it (NIST AI RMF, ISO/IEC 42001, and the EU AI Act's Article 9 risk management requirements) do not name memory architecture explicitly. That is a gap the field will close, but it will close slowly. Risk managers who wait for the frameworks to catch up will be behind.

A practical starting point is to treat agent memory as a data asset and apply the same questions your organisation would apply to any other persistent data store. Who owns it? What does it contain? How long is it retained? Who has read and write access? What are the controls against unauthorised modification? When was it last audited? Is there a mechanism to correct errors or delete entries when required?
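Those questions translate directly into a register entry. The sketch below models one memory store as a record and lists which questions are still unanswered. The field names and sample values are assumptions for illustration, not a standard schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MemoryAssetRecord:
    """Hypothetical risk-register entry for one agent memory store."""

    store_name: str
    owner: Optional[str] = None
    contents: Optional[str] = None
    retention_days: Optional[int] = None
    read_write_access: Optional[list] = None
    modification_controls: Optional[str] = None
    last_audited: Optional[str] = None
    correction_mechanism: Optional[str] = None


def open_questions(record: MemoryAssetRecord) -> list:
    """Return the names of fields that nobody has filled in yet."""
    return [f.name for f in fields(record) if getattr(record, f.name) in (None, "", [])]


record = MemoryAssetRecord(
    store_name="client-advisor-episodic",
    owner="Head of Client Operations",
    retention_days=180,
)
gaps = open_questions(record)
```

An entry with a long list of gaps is itself a finding: it shows where the memory layer sits outside the existing control environment.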

Beyond the data asset framing, there are agent-specific questions that standard data governance does not yet cover. If the agent's procedural memory (its operating instructions) can be modified, who authorises those changes and how are they logged? If semantic memory is updated via external retrieval, what validates the accuracy of the incoming information before it is stored? If episodic memory influences future behaviour, is that influence traceable back to specific past events?

What Needs to Change

Memory needs to be treated as part of the control environment, not as a passive data layer that sits beneath the governance boundary. At minimum, that requires three things.

Integrity. Control over who can write to and modify memory stores — with versioning, approval workflows, and audit trails equivalent to those applied to any other system configuration. If a system prompt, orchestration rule, or knowledge base entry can be changed without a record, the agent's behaviour can change without accountability.

Design. Clear definition of what is stored, for how long, and for what purpose — applied at the architecture stage rather than retrofitted after deployment. This includes retention schedules, data minimisation decisions, and explicit scoping of what each memory layer is permitted to retain.

Oversight. Ongoing visibility into how memory influences decisions over time. Behavioural monitoring that extends beyond outputs to include the contextual inputs that shaped them. Without this, drift is undetectable until its consequences are already material.

Without these, governance stops at the system boundary — while behaviour continues to evolve beyond it.

The Governance Boundary Nobody Has Drawn

The broader insight from the engineering literature on agent memory is that building a reliable agent requires treating memory as an active system — something that must be maintained, governed, and occasionally corrected, not simply accumulated. That is a principle risk managers should recognise immediately, because it describes exactly what responsible data stewardship has always looked like.

The failure mode in agentic deployments is the assumption that memory is an internal implementation detail, owned by the development team, outside the scope of the AI risk function. That assumption was reasonable when agents were stateless and bounded. It does not hold for agents that accumulate state across thousands of interactions, adapt their behaviour based on that state, and operate in regulated contexts where the integrity and provenance of that state have direct legal and operational significance.

The governance boundary needs to be drawn explicitly — around the memory stores, the retrieval mechanisms, the retention schedules, and the modification controls. Organisations that draw it now will be in a substantially better position when regulators begin asking the same questions.

If your agent recalls a client interaction from six months ago and that memory influences today's decision — who has actually reviewed what it remembers?