Research · February 20, 2026 · 10 min read

Memory Strategies for Long-Running Agents

Short-term buffers, episodic memory, and semantic retrieval — understanding which memory architecture fits which agent use case.

Research
Liya Research

One of the hardest problems in production agent systems is memory. How much context should an agent carry? How far back should it remember? When should it retrieve from an external store rather than rely on what is already in the context window?

We've spent months running agents in production across different use cases, and we've converged on three distinct memory architectures, each suited to different task profiles.

Short-term buffer

The simplest architecture: the agent maintains a rolling window of the last N turns or tokens, and that window is included in every model call. This works well for single-session tasks where the user is continuously active. The risk is context bloat: when the window is large, long sessions degrade performance as it fills with low-signal history.
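A rolling window can be sketched in a few lines. The example below is a minimal illustration, not our production code: the `ShortTermBuffer` class, its `max_turns` and `max_tokens` parameters, and the word-count token estimate are all assumptions made for the example. A real system would use the model's own tokenizer.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Turn:
    role: str      # e.g. "user" or "assistant"
    content: str


class ShortTermBuffer:
    """Rolling window over recent turns, bounded by turn count and token budget."""

    def __init__(self, max_turns: int = 20, max_tokens: int = 2000):
        self.max_tokens = max_tokens
        # A deque with maxlen silently drops the oldest turn on overflow.
        self.turns: deque[Turn] = deque(maxlen=max_turns)

    @staticmethod
    def estimate_tokens(text: str) -> int:
        # Assumption: one token per whitespace-separated word.
        # Swap in a real tokenizer for anything beyond a sketch.
        return len(text.split())

    def total_tokens(self) -> int:
        return sum(self.estimate_tokens(t.content) for t in self.turns)

    def add(self, role: str, content: str) -> None:
        self.turns.append(Turn(role, content))
        # Evict from the oldest end until the window fits the token budget,
        # but always keep the newest turn.
        while len(self.turns) > 1 and self.total_tokens() > self.max_tokens:
            self.turns.popleft()

    def as_messages(self) -> list[dict[str, str]]:
        # Shape commonly used for chat-style model calls.
        return [{"role": t.role, "content": t.content} for t in self.turns]
```

Bounding by both turns and tokens is a design choice in this sketch: the turn cap keeps many short messages from piling up, and the token cap handles a few very long ones.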

Episodic memory

For tasks that span multiple sessions or require long-horizon reasoning, we use episodic memory. After each session, the agent generates a structured summary — entities encountered, decisions made, open questions — and stores it in a memory store. On the next session, relevant episodes are retrieved and injected into context.

Episodic memory is what lets a recruiting agent remember that a candidate mentioned relocation concerns three weeks ago — without that context living permanently in the prompt.
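As a rough illustration of the store-then-retrieve cycle, the sketch below keeps episodes in memory and retrieves them by entity overlap. Everything here is hypothetical: the `Episode` fields mirror the summary structure described above, but the `EpisodeStore` class, the overlap scoring, and the `render_for_prompt` helper are assumptions for the example. The summary itself would come from a model call, which is omitted.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Episode:
    session_date: date
    entities: set[str]
    decisions: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)


class EpisodeStore:
    """In-memory stand-in for a persistent memory store."""

    def __init__(self):
        self.episodes: list[Episode] = []

    def save(self, episode: Episode) -> None:
        self.episodes.append(episode)

    def retrieve(self, query_entities: set[str], k: int = 3) -> list[Episode]:
        # Score each episode by how many entities it shares with the query.
        scored = [(len(ep.entities & query_entities), ep) for ep in self.episodes]
        relevant = [(score, ep) for score, ep in scored if score > 0]
        # Highest overlap first; more recent sessions break ties.
        relevant.sort(key=lambda pair: (pair[0], pair[1].session_date), reverse=True)
        return [ep for _, ep in relevant[:k]]


def render_for_prompt(episodes: list[Episode]) -> str:
    """Format retrieved episodes as plain text to inject into context."""
    lines = []
    for ep in episodes:
        names = ", ".join(sorted(ep.entities))
        lines.append(f"[{ep.session_date.isoformat()}] entities: {names}")
        lines += [f"  decision: {d}" for d in ep.decisions]
        lines += [f"  open question: {q}" for q in ep.open_questions]
    return "\n".join(lines)
```

In the recruiting case, an episode tagged with a candidate identifier and `relocation` would be saved after the first conversation. A later session that mentions the same candidate would retrieve it, and only then would it enter the prompt.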

Semantic retrieval

For knowledge-dense tasks where the agent needs to reason over large document collections, semantic retrieval is the right architecture. Rather than pre-loading context, the agent issues retrieval queries mid-task and receives relevant passages on demand. This is the retrieval-augmented generation (RAG) pattern, but applied inside the agent loop rather than as a pre-processing step.
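The difference from pre-processing RAG is easiest to see in a loop. The sketch below is a toy under stated assumptions: `embed` uses bag-of-words counts in place of a real embedding model, and `decide` is a hypothetical callback standing in for the model call that chooses between retrieving and answering. None of these names come from a real library.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase term counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class Retriever:
    def __init__(self, passages: list[str]):
        # Pre-compute one vector per passage.
        self.index = [(p, embed(p)) for p in passages]

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        scored = [(cosine(q, vec), p) for p, vec in self.index]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Return up to k passages, skipping those with no overlap at all.
        return [p for score, p in scored[:k] if score > 0]


def run_agent(task: str, decide, retriever: Retriever, max_steps: int = 5) -> str:
    """On each step the policy either requests a retrieval or returns an answer."""
    context: list[str] = []
    for _ in range(max_steps):
        action, payload = decide(task, context)
        if action == "retrieve":
            # The query is issued mid-task, based on what the agent has seen so far.
            context.extend(retriever.search(payload))
        elif action == "answer":
            return payload
    return "no answer within step budget"
```

Because `decide` sees the accumulated `context` on every step, it can issue follow-up queries that depend on earlier results, which a single up-front retrieval pass cannot do.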

Combining architectures

In practice, complex agents often need all three. A career coaching agent might use a short-term buffer for the current conversation, episodic memory to recall past sessions with the user, and semantic retrieval to pull in relevant job market data or career framework documentation.
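One way to picture the combination is a single function that merges the three sources into one prompt. This is a sketch under assumptions, not a description of our system: the `assemble_context` name, the character budget, and the priority order (live conversation first, then past-session notes, then retrieved passages) are all choices made for the example.

```python
def assemble_context(
    recent_turns: list[str],
    episode_notes: list[str],
    passages: list[str],
    max_chars: int = 2000,
) -> str:
    """Merge three memory sources into one prompt within a character budget."""
    # Assumed priority for spending the budget: the live conversation is
    # kept first, then past-session notes, then retrieved passages.
    sources = [
        ("Current conversation", recent_turns),
        ("Past sessions", episode_notes),
        ("Retrieved passages", passages),
    ]
    remaining = max_chars
    kept: dict[str, list[str]] = {}
    for title, items in sources:
        for item in items:
            # Skip any item that no longer fits; smaller later items may still fit.
            if len(item) <= remaining:
                kept.setdefault(title, []).append(item)
                remaining -= len(item)

    # Render background first and the live conversation last, so the most
    # recent turns sit closest to the model's next response.
    order = ["Past sessions", "Retrieved passages", "Current conversation"]
    blocks = [f"{title}:\n" + "\n".join(kept[title]) for title in order if title in kept]
    return "\n\n".join(blocks)
```

Separating budget priority from render order is deliberate in this sketch: what gets dropped first under pressure need not match where each section appears in the prompt.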