Conversation

emre-openai
Contributor

Summary

New cookbook: Context Engineering - Short-Term Memory Management with Sessions from OpenAI Agents SDK
This guide focuses on two proven context management techniques—trimming and compression—to keep agents fast, reliable, and cost-efficient.

Real-World Scenario

We’ll ground the techniques in a practical example of a common long-running task:

  • Multi-turn Customer Service Conversations
    In extended conversations about tech products—spanning both hardware and software—customers often surface multiple issues over time. The agent must stay consistent and goal-focused while retaining only the essentials rather than hauling along every past detail.

Techniques Covered

To address these challenges, we introduce two concrete approaches using OpenAI Agents SDK:

  1. Trimming Messages – dropping older turns while keeping the last N turns.
  2. Summarizing Messages – compressing prior exchanges into structured, shorter representations.

Motivation

This notebook is our first resource on agent memory. Many of our customers are asking for resources that address agent memory challenges, so this will be a highly valuable asset for customers who are building AI agents and working on context engineering.


For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • I have conducted a self-review of my content based on the contribution guidelines:
    • Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • Correctness: The information I include is correct and all of my code executes successfully.
    • Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

@msingh-openai
Contributor

A few recommendations:

  1. In the opening "If too much is carried forward, the model risks distraction, inefficiency, or outright failure. If too little is preserved, the agent loses coherence...", it would be good to mention the context window size.
  2. In Step 0, also mention setting up OPENAI_API_KEY.
  3. Step 1: "Below we install the openai-agents library (the OpenAI Agents SDK ..." is missing a closing parenthesis.
  4. It's not clear from the cookbook whether context trimming and context summarization are two different techniques or two steps in the same technique. In other words, can you use context trimming on its own, or do you always have to use the two together?
  • If they are two separate techniques then you should highlight what situation they are applicable (pros/cons)
  • If they are two steps in the same technique then update the cookbook to indicate this.

@emre-openai
Contributor Author

  1. I added a context definition and the current value for GPT-5.

Here, context refers to the total window of tokens (input + output) that the model can attend to at once. For GPT-5, this capacity is up to 272k input tokens and 128k output tokens, but even such a large window can be overwhelmed by uncurated histories, redundant tool results, or noisy retrievals. This makes context management not just an optimization, but a necessity.

  2. Added API key instructions.

Before running the workflow, set your environment variables:

# Your OpenAI API key
import os
os.environ["OPENAI_API_KEY"] = "sk-proj-..."

Alternatively, you can set your OpenAI API key for use by the agents via the set_default_openai_key function from the agents library.

from agents import set_default_openai_key
set_default_openai_key("YOUR_API_KEY")
  3. Fixed the typo.

  4. They are two separate techniques. I added pros and cons for each, along with a "best when" section, to make each technique easier to understand and to help readers decide when to use it.

Real-World Scenario

We’ll ground the techniques in a practical example of a common long-running task:

  • Multi-turn Customer Service Conversations
    In extended conversations about tech products—spanning both hardware and software—customers often surface multiple issues over time. The agent must stay consistent and goal-focused while retaining only the essentials rather than hauling along every past detail.

Techniques Covered

To address these challenges, we introduce two separate, concrete approaches using the OpenAI Agents SDK:

  • Context Trimming – dropping older turns while keeping the last N turns.

    • Pros

      • Deterministic & simple: No summarizer variability; easy to reason about state and to reproduce runs.
      • Zero added latency: No extra model calls to compress history.
      • Fidelity for recent work: Latest tool results, parameters, and edge cases stay verbatim—great for debugging.
      • Lower risk of “summary drift”: You never reinterpret or compress facts.

      Cons

      • Forgets long-range context abruptly: Important earlier constraints, IDs, or decisions can vanish once they scroll past N.
      • User experience “amnesia”: Agent can appear to “forget” promises or prior preferences midway through long sessions.
      • Wasted signal: Older turns may contain reusable knowledge (requirements, constraints) that gets dropped.
      • Token spikes still possible: If a recent turn includes huge tool payloads, your last-N can still blow up the context.
    • Best when

      • Tasks in the conversation are independent of each other, with non-overlapping context that does not require carrying earlier details forward.
      • You need predictability, easy evals, and low latency (ops automations, CRM/API actions).
      • The conversation’s useful context is local (recent steps matter far more than distant history).
  • Context Summarization – compressing prior messages (assistant, user, tool, etc.) into structured, shorter summaries injected into the conversation history.

    • Pros

      • Retains long-range memory compactly: Past requirements, decisions, and rationales persist beyond N.
      • Smoother UX: Agent “remembers” commitments and constraints across long sessions.
      • Cost-controlled scale: One concise summary can replace hundreds of turns.
      • Searchable anchor: A single synthetic assistant message becomes a stable “state of the world so far.”

      Cons

      • Summarization loss & bias: Details can be dropped or misweighted; subtle constraints may vanish.
      • Latency & cost spikes: Each refresh adds model work (and potentially tool-trim logic).
      • Compounding errors: If a bad fact enters the summary, it can poison future behavior (“context poisoning”).
      • Observability complexity: You must log summary prompts/outputs for auditability and evals.
    • Best when

      • You have use cases where tasks need context collected across the flow, such as planning/coaching, RAG-heavy analysis, or policy Q&A.
      • You need continuity over long horizons and carry the important details further to solve related tasks.
      • Sessions exceed N turns but must preserve decisions, IDs, and constraints reliably.
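To make the trimming approach concrete, here is a minimal sketch (this is illustrative only, not the cookbook's actual implementation; the function name and the definition of a "turn" as starting at each user message are assumptions):

```python
# Illustrative last-N-turn trimming (assumption: a "turn" begins at each
# user message and includes all following messages until the next user turn).
def trim_to_last_n_turns(messages, n):
    """Return only the messages belonging to the most recent n turns."""
    # Indices where a user message opens a new turn.
    turn_starts = [i for i, m in enumerate(messages) if m["role"] == "user"]
    if len(turn_starts) <= n:
        return list(messages)  # fewer than n turns: nothing to trim
    cutoff = turn_starts[-n]   # start of the oldest turn we keep
    return messages[cutoff:]
```

Because no model call is involved, trimming is deterministic and adds no latency, which is why it is easy to reproduce and evaluate.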
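The summarization approach can be sketched in the same spirit (again an illustrative outline, not the cookbook's code; the `summarize` callable stands in for a real model call, and the synthetic-assistant-message shape is an assumption):

```python
# Illustrative context summarization: replace older messages with one
# synthetic assistant "summary" message, keeping the most recent
# `keep_last` messages verbatim.
def compress_history(messages, summarize, keep_last=4):
    """Compress all but the last `keep_last` messages into a summary message.

    `summarize` is any callable mapping a list of messages to a short
    string; in practice this would be a model call.
    """
    if len(messages) <= keep_last:
        return list(messages)  # nothing old enough to compress
    older, recent = messages[:-keep_last], messages[-keep_last:]
    summary_msg = {
        "role": "assistant",
        "content": f"Summary of earlier conversation: {summarize(older)}",
    }
    return [summary_msg] + recent
```

The single summary message acts as the "searchable anchor" mentioned above: one compact, stable representation of the state of the world so far.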

| Dimension | Trimming (last-N turns) | Summarizing (older → generated summary) |
| --- | --- | --- |
| Latency / Cost | Lowest (no extra calls) | Higher at summary refresh points |
| Long-range recall | Weak (hard cut-off) | Strong (compact carry-forward) |
| Risk type | Context loss | Context distortion/poisoning |
| Observability | Simple logs | Must log summary prompts/outputs |
| Eval stability | High | Needs robust summary evals |
| Best for | Tool-heavy ops, short workflows | Analyst/concierge, long threads |

@emre-openai emre-openai merged commit 7433ba1 into main Sep 10, 2025
1 check passed
@emre-openai emre-openai deleted the emre/agents-context-management branch September 10, 2025 21:07
@biggerveggies

Sorry, this is very minor - just noted in session_memory.ipynb. --- the word 'independent' appears to be misspelled as 'indepentent' under the 'Techniques Covered' section.
