Types of AI Agent Memory: Short, Long, Episodic

Updated May 2026

AI agents use several distinct types of memory that differ in how long they last and what they hold. Short-term, or working, memory is the information in the model's context window during a single session, while long-term memory is the durable store that survives across sessions. Long-term memory is itself usually divided into three kinds borrowed from human cognition: semantic memory for facts and knowledge, episodic memory for specific past experiences and events, and procedural memory for learned skills and routines. A capable agent combines all of these, using working memory to reason about the task in front of it and long-term memory to carry knowledge, experiences, and skills forward across the gaps between conversations.

The Two-Level Foundation: Short-Term and Long-Term

The most fundamental division in agent memory is between short-term and long-term storage, and every other distinction sits underneath these two. Short-term memory, also called working memory, is simply the contents of the model's context window during the current request. It includes the system instructions, the conversation so far, the current task, and anything the memory system has retrieved and injected. It is immediately available because it is already in the prompt, it requires no lookup, and it is the space in which the model actually reasons. Its defining limitation is that it is temporary: when the session ends, or when older content scrolls out of the window to make room for new content, that information is gone.
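The way older content scrolls out of the window can be sketched in a few lines of Python. This is a minimal illustration, not how any particular framework manages context: the class name WorkingMemory, the max_tokens budget, and the rule that one whitespace-separated word counts as one token are all assumptions made for the example.

```python
from collections import deque


class WorkingMemory:
    """Hypothetical sketch of a context window with a fixed token budget."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.messages = deque()

    @staticmethod
    def count_tokens(text):
        # Assumption: one whitespace-separated word is one token.
        # A real system would use the model's own tokenizer.
        return len(text.split())

    def total_tokens(self):
        return sum(self.count_tokens(m) for m in self.messages)

    def add(self, message):
        """Append a message and return the messages evicted to make room."""
        self.messages.append(message)
        evicted = []
        # Drop the oldest messages until the budget is met, but always
        # keep the message that was just added.
        while self.total_tokens() > self.max_tokens and len(self.messages) > 1:
            evicted.append(self.messages.popleft())
        return evicted
```

Anything returned by add is gone from working memory. Whether it is also gone for good depends on whether something wrote it to long-term storage first.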

Long-term memory is everything stored outside the context window in durable storage that persists across sessions, restarts, and time. It is what lets an agent recall a conversation from last week, a preference you stated a month ago, or a correction it received yesterday. Because it lives outside the model, long-term memory must be actively written to and retrieved from: the agent decides what is worth keeping, stores it in a database or vector store, and searches that store on future tasks to pull the relevant pieces back into working memory. This write-and-retrieve cycle is the heart of how memory works, and it is covered in depth in how memory systems work.
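The write-and-retrieve cycle can be illustrated with a small sketch. Everything here is an assumption made for the example: the LongTermStore class, the JSON file used as durable storage, and the word-overlap score, which stands in for the vector similarity search a production system would more likely use.

```python
import json
from pathlib import Path


class LongTermStore:
    """Hypothetical durable memory store backed by a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        # Load whatever an earlier session left behind, if anything.
        if self.path.exists():
            self.records = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.records = []

    def write(self, text):
        """Store a memory and persist it immediately."""
        self.records.append(text)
        self.path.write_text(json.dumps(self.records), encoding="utf-8")

    def retrieve(self, query, k=2):
        """Return up to k records sharing the most words with the query."""
        query_words = set(query.lower().split())
        scored = []
        for record in self.records:
            overlap = len(query_words & set(record.lower().split()))
            if overlap > 0:
                scored.append((overlap, record))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [record for _, record in scored[:k]]


def build_prompt(store, user_message):
    """Inject retrieved memories into the text sent to the model."""
    memories = store.retrieve(user_message)
    memory_block = "\n".join(f"- {m}" for m in memories)
    return f"Relevant memories:\n{memory_block}\n\nUser: {user_message}"
```

A second LongTermStore opened on the same path sees the records the first one wrote, which is the property that separates this layer from the context window.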

The relationship between the two levels mirrors the relationship between a computer's RAM and its disk. Working memory is fast, limited, and volatile, like RAM. Long-term memory is larger, slower to access, and durable, like disk. Just as a program loads the data it needs from disk into RAM to work on it, an agent retrieves the memories it needs from long-term storage into its context window to reason about them. The analogy also captures the central design problem of agent memory: deciding what to keep in the fast, limited space and what to leave in the large, durable one until it is needed.

Semantic Memory: What the Agent Knows

Semantic memory holds facts and general knowledge independent of when or how they were learned. For an agent, semantic memory is the store of stable truths about its world: that a particular customer is on the enterprise plan, that a company's refund window is thirty days, that a codebase is written in a specific language, or that a user prefers concise answers. These are facts that remain true across many sessions and that the agent should be able to recall whenever they become relevant, without needing to remember the specific conversation in which it first learned them.

What makes a memory semantic is that the fact has been abstracted away from its origin. You know that water boils at one hundred degrees Celsius at sea level without recalling the lesson where you learned it, and an agent's semantic memory works the same way. This abstraction is powerful because it makes facts easy to retrieve and combine, but it also means semantic memory is where contradictions and staleness cause the most trouble. If a fact changes, such as a customer upgrading their plan, the old fact must be updated or it will keep surfacing. Maintaining semantic memory therefore depends heavily on conflict resolution and deduplication, the upkeep practices described in memory consolidation.
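One simple way to keep a changed fact from resurfacing is to key each fact by what it describes and let the newest value win. The sketch below assumes that rule. The Fact and SemanticMemory names, the (subject, attribute) key, and the last-write-wins policy are illustrative choices, and real systems may resolve conflicts differently.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    updated_at: datetime


class SemanticMemory:
    """Hypothetical fact store holding one value per (subject, attribute)."""

    def __init__(self):
        self._facts = {}

    def upsert(self, fact):
        """Insert a fact, replacing the stored one only if this one is not older."""
        key = (fact.subject, fact.attribute)
        current = self._facts.get(key)
        if current is None or fact.updated_at >= current.updated_at:
            self._facts[key] = fact

    def lookup(self, subject, attribute):
        fact = self._facts.get((subject, attribute))
        return fact.value if fact else None
```

Because there is only one slot per key, a plan upgrade overwrites the old plan instead of sitting next to it, and a stale record arriving late is ignored.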

Episodic Memory: What the Agent Has Experienced

Episodic memory holds specific past experiences as discrete events, tied to a particular time and context. Where semantic memory knows that a fact is true, episodic memory remembers that a specific thing happened: the conversation you had on Tuesday, the task the agent completed last month, the error it hit and the fix that resolved it. Each episodic memory is a record of an event, usually timestamped, that the agent can retrieve to reconstruct what occurred and learn from it.

Episodic memory is what gives an agent continuity and the ability to learn from its own history. When an agent recalls that a particular approach failed last time and chooses a different one, it is drawing on episodic memory. Because episodes are tied to time, they are typically stored as a log that can be searched both by relevance, finding the episodes most similar to the current situation, and by recency, favoring recent events over older ones. Over time, many similar episodes are often consolidated into semantic knowledge: after resolving the same kind of issue a dozen times, the agent can distill the pattern into a general fact, which is exactly how repeated experience turns into durable knowledge.
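Searching by relevance and by recency at once is often done by multiplying the two scores. The sketch below is one possible scoring scheme rather than a standard one: word overlap (Jaccard similarity) stands in for embedding similarity, and the exponential decay with a 30-day half-life is an assumed parameter chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Episode:
    timestamp: datetime
    summary: str


def relevance(query, text):
    """Jaccard overlap between the words of the query and the text."""
    a, b = set(query.lower().split()), set(text.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recency(timestamp, now, half_life_days=30.0):
    """Decay factor that halves every half_life_days (assumed value)."""
    age_days = (now - timestamp).total_seconds() / 86400
    return 0.5 ** (age_days / half_life_days)


def search_episodes(episodes, query, now, k=3):
    """Return up to k episodes ranked by relevance times recency."""
    scored = []
    for episode in episodes:
        score = relevance(query, episode.summary) * recency(episode.timestamp, now)
        if score > 0:
            scored.append((score, episode))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [episode for _, episode in scored[:k]]
```

With this scoring, two equally relevant episodes are ordered newest first, and an episode with no overlap is never returned no matter how recent it is.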

Procedural Memory: What the Agent Knows How to Do

Procedural memory holds learned skills and routines, the knowledge of how to perform a task rather than facts about the world or records of events. In humans, procedural memory is what lets you ride a bicycle without consciously recalling each movement. In an agent, procedural memory captures reliable sequences of actions: the steps that resolve a particular class of support ticket, the chain of tool calls that reliably completes a workflow, or the format a certain kind of report should follow. It is the memory of method.

Procedural knowledge often lives in a different place from semantic and episodic memory. Some of it is encoded directly in the agent's system prompt as instructions and examples, some lives in a library of reusable routines or tools the agent can invoke, and some is baked into the model's weights through fine-tuning when a procedure is stable enough to be worth making permanent. The boundary between procedural memory held in context and procedural knowledge trained into the model is exactly the boundary between runtime adaptation and training, a distinction explored in how agents learn over time. The practical point is that knowing how to do something is its own kind of memory, distinct from knowing that something is true.
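The "library of reusable routines" form of procedural memory can be sketched as a registry that maps a task name to a function. The registry, the decorator, and the three billing steps below are hypothetical and exist only to show the shape of the idea.

```python
# Hypothetical registry: task name -> routine that returns ordered steps.
PROCEDURES = {}


def procedure(name):
    """Decorator that stores a routine under a task name."""
    def register(fn):
        PROCEDURES[name] = fn
        return fn
    return register


@procedure("billing_dispute")
def handle_billing_dispute(ticket):
    # Illustrative steps only; a real routine would call actual tools.
    return [
        f"look up invoice for {ticket['customer']}",
        "compare charge against plan price",
        "issue refund or explain charge",
    ]


def run_procedure(name, ticket):
    """Look up a stored routine by name and run it."""
    if name not in PROCEDURES:
        raise KeyError(f"no procedure stored for {name!r}")
    return PROCEDURES[name](ticket)
```

The agent does not need to rediscover the steps each time. It only needs to recognize which class of task it is facing and invoke the stored routine.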

How the Types Work Together

These types are not competing options; a capable agent uses all of them at once, each handling the kind of information it suits best. Consider a support agent resolving a billing question. Working memory holds the current conversation. Semantic memory supplies the stable facts: this customer's plan, the relevant policy, the product details. Episodic memory recalls that this same customer raised a related issue last month and how it was resolved. Procedural memory provides the reliable sequence of steps for handling a billing dispute. The response the agent produces draws on every layer simultaneously, and the quality of the result depends on each layer holding the right information.
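In code, drawing on every layer usually comes down to assembling one prompt from several sources. The function below is a minimal sketch of that assembly step. The section titles and their ordering are assumptions, and the inputs are taken to be lists of strings already retrieved from each memory type.

```python
def assemble_context(conversation, facts, episodes, steps):
    """Combine the memory layers into a single prompt string."""
    sections = [
        ("Known facts", facts),             # semantic memory
        ("Related past events", episodes),  # episodic memory
        ("Procedure", steps),               # procedural memory
        ("Conversation", conversation),     # working memory
    ]
    lines = []
    for title, items in sections:
        if not items:
            continue  # leave out layers that have nothing to contribute
        lines.append(f"{title}:")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)
```

An empty layer simply drops out of the prompt, which makes it easy to see how a missing or poorly populated layer leaves the model with less to reason from.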

The types also feed into one another over time. Episodic memories of many similar interactions consolidate into semantic facts. Repeated successful procedures can be promoted from context into the model through training. Semantic facts retrieved into working memory shape the reasoning of the current turn. This flow, from specific experience to general knowledge to ingrained skill, is the same progression that turns a novice into an expert, and designing an agent's memory means designing the pathways along which information moves between these levels.
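The first of these pathways, episodes consolidating into facts, can be approximated by counting. The sketch below assumes each episode is a dict with "issue" and "fix" keys and promotes a fix to a general fact once it has been seen min_count times. Both the record shape and the threshold are illustrative assumptions, not a description of any specific consolidation algorithm.

```python
from collections import Counter


def consolidate(episodes, min_count=3):
    """Promote fixes seen at least min_count times into issue -> fix facts."""
    counts = Counter((e["issue"], e["fix"]) for e in episodes)
    facts = {}
    # most_common() yields the most frequent pairs first, so when an issue
    # has several qualifying fixes the most frequent one is kept.
    for (issue, fix), n in counts.most_common():
        if n >= min_count and issue not in facts:
            facts[issue] = fix
    return facts
```

The output is a semantic fact with no timestamp and no link to any single conversation, which is the abstraction step that turns repeated experience into knowledge.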

Choosing Which Types Your Agent Needs

Not every agent needs every type of memory, and matching the memory design to the use case keeps the system simple and effective. A stateless tool that answers one-off questions may need no long-term memory at all, relying entirely on its context window. A personal assistant needs strong semantic memory for preferences and episodic memory for shared history. An agent that performs complex multi-step workflows benefits most from procedural memory. Adding memory types an application does not need only adds cost, latency, and opportunities for error, so the discipline is to implement the minimum set that the use case genuinely requires.

A useful way to decide is to ask what the agent must carry across sessions for it to feel coherent and competent. If it must remember facts about users or its domain, it needs semantic memory. If it must recall and learn from specific past interactions, it needs episodic memory. If it must reliably repeat complex procedures, it needs procedural memory. Many production agents end up with semantic and episodic memory as the core, since those tend to deliver the largest improvement in how personal and consistent the agent feels, with procedural memory added when the tasks are complex enough to justify it. The retrieval techniques that surface all of these are covered in memory retrieval strategies.
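The three questions above translate directly into a small checklist function. It is only a restatement of the decision rule in code, with hypothetical parameter names, and is not a substitute for judging the use case.

```python
def required_memory_types(needs_facts, needs_past_interactions, needs_procedures):
    """Map the three questions to the long-term memory types they call for."""
    types = []
    if needs_facts:
        types.append("semantic")
    if needs_past_interactions:
        types.append("episodic")
    if needs_procedures:
        types.append("procedural")
    return types
```

A stateless question-answering tool answers no to all three and gets an empty list, meaning the context window alone is enough.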

Key Takeaway

AI agent memory divides into short-term working memory, held in the context window, and long-term memory, which persists across sessions and splits into semantic memory for facts, episodic memory for experiences, and procedural memory for skills. A strong agent combines all the types it needs, letting specific experiences consolidate into general knowledge and proven procedures harden into ingrained skill.