If you give an agent (Claude Code, Cursor, Codex) a tool to save observations — "save_observation: persist this insight for future sessions" — and explicitly instruct it to use the tool in system prompts, config files, everywhere you can, it calls it maybe 30% of the time.
The agent will happily use tools that help it complete the current task. But a tool that only benefits future sessions? Almost never.
My working theory: these models are optimized for task completion within the current context window. Saving an observation has zero value for the current task — it's a token cost with no immediate reward. The model has learned that every token spent on "let me save this for later" is a token not spent on the actual work. The incentive structure is wrong at the training level.
I ended up building a passive observation system that watches what the agent does and infers observations from tool calls and AST-level code diffs, without requiring agent cooperation. But I'm curious if others have found ways to make agents reliably self-document.
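To make the "infer observations from AST-level diffs" idea concrete, here's a minimal sketch of one piece of such a watcher: diffing two snapshots of a Python file at the definition level instead of the text level. All names here (`infer_observations`, the observation strings) are illustrative, not the actual system described above.

```python
import ast

def infer_observations(before: str, after: str) -> list[str]:
    """Compare two versions of a source file and emit observation strings
    describing AST-level changes -- no agent cooperation required."""
    def defs(src: str) -> dict[str, str]:
        # Map each function/class name to a structural dump of its subtree.
        tree = ast.parse(src)
        return {
            n.name: ast.dump(n)
            for n in ast.walk(tree)
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        }

    old, new = defs(before), defs(after)
    obs = []
    for name in new.keys() - old.keys():
        obs.append(f"added definition: {name}")
    for name in old.keys() - new.keys():
        obs.append(f"removed definition: {name}")
    for name in old.keys() & new.keys():
        if old[name] != new[name]:  # body or signature changed structurally
            obs.append(f"modified definition: {name}")
    return sorted(obs)
```

A real system would hook this into the agent's file-write tool calls and persist the resulting observations, but the key property is visible even in the sketch: the agent never has to choose to document anything.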
Has anyone solved this? Techniques like:

- Prompt structures that actually get agents to save context
- Fine-tuning approaches that reward knowledge retention
- Alternative architectures for persistent agent memory
Or is passive observation the only reliable path when the agent won't cooperate?
The pattern that improved compliance for us was turning memory into a required finalizer step: a task is only considered done after (1) the artifact output and (2) a structured observation write with a fixed schema (decision, evidence, failure, next-check). If the second step is missing, a checker agent rejects completion and asks for a retry.
Prompting alone stayed flaky. Gating plus an explicit verifier moved behavior from "optional hygiene" to "part of done". Passive extraction is still valuable, but as a safety net instead of the primary path.
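The verifier side of that gate can be very small. Here's a sketch of the completion check, assuming results arrive as dicts; the field names come from the schema above, and everything else (`verify_completion`, the dict shape) is made up for illustration.

```python
# Fields from the fixed observation schema: decision, evidence, failure, next-check.
REQUIRED_FIELDS = ("decision", "evidence", "failure", "next_check")

def verify_completion(result: dict) -> tuple[bool, str]:
    """Reject a task result unless it carries both an artifact and a
    structured observation with every required field filled in."""
    if not result.get("artifact"):
        return False, "missing artifact output -- retry"
    obs = result.get("observation")
    if not isinstance(obs, dict):
        return False, "missing structured observation -- retry"
    missing = [f for f in REQUIRED_FIELDS if not obs.get(f)]
    if missing:
        return False, f"observation incomplete ({', '.join(missing)}) -- retry"
    return True, "done"
```

The point is that "done" is no longer something the working agent declares; it's something this check grants, which is why the observation write stops being optional hygiene.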