argbe.tech - news

Deep Agents adds context compression for long-running agent runs

LangChain details a three-layer approach in its Deep Agents SDK for keeping agent sessions within context limits: offload large tool payloads to disk, trim redundant arguments, and summarize only as a last resort.

LangChain outlined how its open-source Deep Agents SDK keeps long-running agents productive when conversations outgrow a model’s context window.

  • Uses a filesystem-backed “scratch space” so agents can write, search, and re-read information that was removed from the active prompt.
  • Offloads any single tool response that exceeds 20,000 tokens, replacing it in context with a file reference plus a 10-line preview.
  • Trims older file write/edit arguments after the session crosses about 85% of the model’s context window, since the full content already exists on disk.
  • Falls back to summarization only after offloading can’t free enough space: a structured in-context summary replaces the history, while the full message log is preserved to the filesystem.
  • Includes targeted evals that force summarization mid-task and check recovery of a “needle-in-the-haystack” fact via filesystem search.
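The offload-then-summarize decision described above can be sketched roughly as follows. This is an illustrative sketch using the article's thresholds as constants, not the SDK's actual API; the function names, the 4-chars-per-token estimate, and the assumed 200k context window are all hypothetical.

```python
import os
import tempfile

# Hypothetical constants mirroring the article's numbers; the real SDK's
# values and interfaces may differ.
OFFLOAD_TOKEN_LIMIT = 20_000   # offload any single tool result above this
CONTEXT_WINDOW = 200_000       # assumed model context window
TRIM_THRESHOLD = 0.85          # act once history crosses 85% of the window
PREVIEW_LINES = 10             # lines kept in the in-context preview


def count_tokens(text: str) -> int:
    """Crude token estimate (~4 chars/token); a real agent would use
    the model's own tokenizer."""
    return max(1, len(text) // 4)


def offload_if_large(tool_output: str, scratch_dir: str, name: str) -> str:
    """Write an oversized tool output to the scratch space and return a
    file reference plus a short preview; small outputs pass through."""
    if count_tokens(tool_output) <= OFFLOAD_TOKEN_LIMIT:
        return tool_output
    path = os.path.join(scratch_dir, f"{name}.txt")
    with open(path, "w") as f:
        f.write(tool_output)
    preview = "\n".join(tool_output.splitlines()[:PREVIEW_LINES])
    return f"[output offloaded to {path}; first {PREVIEW_LINES} lines]\n{preview}"


def needs_summarization(messages: list[str]) -> bool:
    """Fall back to summarization only once the running history
    crosses the trim threshold."""
    used = sum(count_tokens(m) for m in messages)
    return used > TRIM_THRESHOLD * CONTEXT_WINDOW


with tempfile.TemporaryDirectory() as scratch:
    big = "row\n" * 50_000  # ~50k tokens, well past the offload limit
    kept = offload_if_large(big, scratch, "query_result")
    print(kept.splitlines()[0])        # the file reference line
    print(needs_summarization([kept])) # the preview alone fits easily
```

The key design point from the article survives even in this toy version: because the full payload lives on disk, the agent can later search or re-read it via the filesystem, so replacing it in context with a stub loses nothing permanently.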