spooling
April 27, 2026 · 5 min read

Agentic Amnesia

Memory shipped in 2026. Cross-tool memory didn't. The seam between tools is where the cost lives, and it's worst for teams running more than one AI assistant.

An engineer spends forty minutes with Claude Code working through a tricky migration. They land the fix. They close the laptop. The next morning they open Cursor on a different branch, and the model has no idea what they figured out yesterday. They re-explain. They paste the same stack trace, the same schema, the same constraint about why the obvious answer is wrong. The model agrees, very confidently, and proposes the obvious answer. Meanwhile a teammate working a related PR needs that same reasoning to make their own change, but it's stuck inside a session on the first engineer's laptop, in a tool the teammate may not even use.

The market has converged on a name for this: agentic amnesia. I'll use it here for a specific cut: not amnesia inside one tool's session, but amnesia at the seams between tools.

This isn't a one-engineer problem. Multiply the anecdote above across every developer on the team, every team in the firm, every client engagement running in parallel. As a consultant, I switch between Claude Code, Codex, and Cursor depending on the SOW, the client's security protocols, and what their procurement function will accept. The developer next to me is doing the same thing on a different schedule, for a different client, with a different stack. Across a firm, those tool resets aren't isolated events. They're a constant. And the institutional knowledge each reset erases compounds team-wide: nobody can search what the engineer one desk over figured out yesterday, because it lived inside a different tool on a different laptop.

A reasonable pushback: the vendors have shipped memory. Claude Code has Auto Memory and CLAUDE.md. Claude has cross-chat memory. Claude Managed Agents got persistent memory in April. ChatGPT has memory. Cursor has Rules. claude-mem hit GitHub Trending in February and is now north of sixty thousand stars. The "stateless model" framing of 2024 is wrong now.

True, but missing the point. Every one of those memory features lives inside the tool that ships it. From Anthropic's own Claude Code docs, emphasis mine: "Auto memory is machine-local... Files are not shared across machines or cloud environments." The same shape applies to every other vendor in that list. Each tool only remembers its own sessions. None of them know about each other.

So agentic amnesia isn't about no memory existing. It's about the memory being vendor-walled by design, and the moment a developer switches tools, none of yesterday's reasoning crosses with them. The amnesia lives at the seams.

Plenty of teams already hack around this with markdown files committed to the repo: CLAUDE.md, NOTES.md, /docs/decisions/. It's better than nothing, and it isn't vendor-walled. It also only captures the calls someone remembered to write down, never the chain of thought, the dead ends, or the cost of any of it. It isn't searchable by meaning. And it's somebody's manual job to maintain, which means the moment that somebody rotates off, the file goes stale.

What it actually costs

Three drains, all measurable, none currently measured.

Time. Honest answers from developers I've talked to cluster between ten and thirty minutes per re-engagement, which matches what software-developer-specific research on interruptions has consistently found: resumption time after a context break falls in the fifteen-to-thirty-minute range. Cross-tool re-syncing tends toward the high end of that, because the work has to be reconstructed inside a fresh model that has no record of what came before, not just reloaded into the developer's head. And the underlying productivity gain is itself contested: METR's 2025 study of sixteen experienced open-source developers found they were 19% slower with AI tools while believing they had been faster.

Money. Every re-explained context is tokens. Every token is dollars. At Anthropic's 2026 list pricing, Sonnet 4.6 is $3 per million input tokens and $15 per million output. A single re-engagement that pastes a few hundred lines of code, a stack trace, and the back-and-forth needed to orient a fresh chat easily burns 25,000 input tokens and a few thousand output. Pennies on a single session, real money across a fleet.
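
The arithmetic above is easy to run. A minimal sketch, using the Sonnet 4.6 list prices and the 25,000-input / 3,000-output example from the text; the fleet assumptions (150 developers, two tool switches a day, 21 working days) are illustrative, not measured:

```python
# Back-of-envelope cost of re-explained context, at 2026 list pricing.
INPUT_PER_M = 3.00    # USD per million input tokens (Sonnet 4.6)
OUTPUT_PER_M = 15.00  # USD per million output tokens

def engagement_cost(input_tokens: int, output_tokens: int) -> float:
    """Token cost of one re-engagement, in USD."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

per_session = engagement_cost(25_000, 3_000)   # the example from the text
print(f"per re-engagement: ${per_session:.3f}")  # $0.120

# Hypothetical fleet: 150 devs x 2 tool switches/day x 21 working days
monthly = per_session * 150 * 2 * 21
print(f"fleet, per month:  ${monthly:,.2f}")
```

Note the token bill is the small half: at any plausible loaded rate, the ten-to-thirty minutes of developer time attached to each of those 6,300 monthly re-engagements dwarfs the API line item.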

And it isn't just engineering leaders flying blind. PMs, practice leads, partners, and finance across the firm have no visibility into which AI tools are even being used, by whom, on which client engagement. Developers expense their own subscriptions, share license seats informally, rotate tools by project, and the rollup lives nowhere. There's no AWS Cost Explorer for agentic engineering. There's just a bill, and it arrives a month after the work is done.

Knowledge. When a developer rolls off, the codebase remains. The commits remain. What doesn't remain is the trail of "we tried X, it broke because of Y, so we did Z." That reasoning lived in chat windows. Some of it now lives in CLAUDE.md or Cursor Rules. Almost none of it lives anywhere a next engineer using different tools can search. The code got cheaper to write. The institutional knowledge around it got more expensive to retain.

Why the vendors won't fix this for you

The cheaper tokens get, the more of them everyone uses. Falling unit cost is not the same as falling total spend, and it has nothing to do with the cross-tool waste, which compounds with volume regardless of price.

The numbers behind the lock-in aren't subtle. Anthropic went from roughly $1B in annualized revenue in January 2025 to roughly $30B by April 2026. Claude Code alone hit a $2.5B run rate by February, and Anthropic's share of enterprise coding-assistant spend sits around 32%.

That share is the prize. No vendor with that prize in sight is going to build a first-class import path for a competitor's sessions. The memory each tool offers is the memory of itself, by design.

The unification has to live somewhere else: in a layer that sits across every tool the team uses, owned by the team, not the model vendor. Three things that layer has to do.

Cross-tool by default. A memory that only knows Claude is just Claude with a longer context window. The point is to make the Cursor session you ran yesterday searchable from inside the Codex session you're running today. That means parsing every vendor's session format and treating "the model" as one component of a record that includes the developer, the project, the tool calls, and the timestamp.

Searchable by meaning, not by string. Nobody remembers the exact phrase they typed three weeks ago. They remember the shape of the problem. Vector search over the session content is what makes the memory useful instead of just archived.
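
The retrieval machinery is not exotic. A minimal sketch of the shape, with a toy hashed bag-of-words standing in for a real embedding model (the toy only matches overlapping vocabulary; a production system would substitute an actual semantic embedding, and every name and the 256-dimension choice here are mine):

```python
import hashlib
import math

DIM = 256  # toy embedding size; a real system would call an embedding model

def embed(text: str) -> list[float]:
    """Hashed bag-of-words: a crude stand-in for a semantic embedding."""
    v = [0.0] * DIM
    for tok in text.lower().split():
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        v[h % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Two sessions from two different tools, indexed side by side.
sessions = {
    "cursor-2026-04-12": "postgres migration deadlock on the orders table",
    "claude-2026-04-13": "css grid layout for the settings page",
}
index = {sid: embed(text) for sid, text in sessions.items()}

def search(query: str) -> list[str]:
    """Session ids ranked by similarity to the query, best first."""
    q = embed(query)
    return sorted(index, key=lambda sid: cosine(q, index[sid]), reverse=True)

print(search("why did the orders migration deadlock")[0])
# -> cursor-2026-04-12
```

The point of the sketch is the index shape: queries rank sessions regardless of which tool produced them, which is exactly what string search over per-vendor log files can't do.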

Cost-attributable. If the system already has the full session record, attaching tokens, dollars, project, and developer to each one is essentially free. The moment cost is per-engagement, the conversation with stakeholders changes. Agentic engineering goes from a flat OpEx line to something you can charge back to the client engagement that produced it.
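One way the record described above could look, sketched as a dataclass. Every field name here is my invention, not any existing tool's schema; the pricing defaults are the Sonnet 4.6 list prices from earlier in the piece, and a real system would look them up per model:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class SessionRecord:
    """A vendor-neutral record of one assistant session.

    The model is one field among several, so the same record shape
    works whether the session came from Claude Code, Codex, or Cursor.
    """
    tool: str            # e.g. "claude-code", "cursor", "codex"
    model: str
    developer: str
    project: str         # the client engagement to charge back to
    started_at: datetime
    input_tokens: int
    output_tokens: int
    transcript: str      # parsed out of the vendor's session format

    def cost_usd(self, in_per_m: float = 3.0, out_per_m: float = 15.0) -> float:
        """Dollar cost of this session, attributable to its project."""
        return (self.input_tokens * in_per_m +
                self.output_tokens * out_per_m) / 1e6

rec = SessionRecord(
    tool="claude-code", model="sonnet-4.6", developer="avi",
    project="acme-migration", started_at=datetime.now(timezone.utc),
    input_tokens=25_000, output_tokens=3_000, transcript="...",
)
print(f"{rec.project}: ${rec.cost_usd():.3f}")  # a per-engagement line item
```

Grouping these records by `project` is the whole chargeback report, which is the sense in which cost attribution is "essentially free" once the full session record exists.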

Where this lands hardest

The firms most exposed to agentic amnesia are mid-market consulting and engineering shops, the $50M to $200M range. Hundreds of developers, dozens of client projects, no resources to build internal tooling, and an AI stack that varies by client because procurement varies by client. The pitch in that environment isn't abstract: your devs are losing real billable time re-syncing context across tools, the next engineer rolling onto a project rediscovers everything from scratch, and you can't tell a client what slice of your AI bill came from their engagement. You should be able to.

That's not a feature pitch. It's a P&L pitch dressed up as engineering tooling, which is the only kind of pitch that closes inside a firm where partners read everything through a margin lens.

The independent ledger

The vendors will keep changing. Models will be deprecated, UIs rewritten, tools acquired or sunsetted. Whatever the dominant coding assistant is in 2028, odds are it isn't the dominant one today. None of that should erase the reasoning your team produced this year.

A memory layer is, in that sense, an independent ledger for agentic engineering. It's the part you keep when you change vendors. It's the part that lets a new hire in 2027 ask "why did we go with Postgres here" and get an answer from the engineer who's no longer at the company. It's the part that turns agentic engineering from a transaction into an asset.

Vendor-walled memory is the default. It doesn't have to be.


Notes and sources

The phrase "agentic amnesia" and adjacent phrases like "AI amnesia" have been in circulation through 2025 and 2026 as the agent space has wrestled with the memory problem. Useful entry points include MMC's Treating AI's amnesia and other disorders and Oracle's Agent Memory: Why Your AI Has Amnesia and How to Fix It. I'm using the term here for the cross-tool cut specifically.

Cited research:

  • METR. Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity, July 2025. arxiv.org/abs/2507.09089.
  • Abad et al. (Calgary and Leibniz Hannover), Task Interruption in Software Development Projects: What Makes Some Interruptions More Disruptive than Others?, ACM EASE 2018: dl.acm.org/doi/10.1145/3210459.3210471 (preprint at arxiv.org/abs/1805.05508). The developer-specific study behind the 15-to-30-minute resumption-time range, based on 4,910 recorded tasks across 17 professional developers plus a survey of 132. DX's readable 2023 summary is the best entry point. The range traces back to Gloria Mark's foundational CHI 2008 paper and has been replicated in developer contexts since.

Anthropic primary sources:

The Auto Memory quote in the body is from Anthropic's Claude Code memory documentation; the $3/$15 per-million-token figures are Anthropic's published list pricing for Sonnet 4.6.

Reported revenue and market share (Anthropic ~$1B → ~$30B in 2025-2026, $2.5B Claude Code run rate by February 2026, ~32% enterprise coding share) are widely covered, e.g. SaaStr.

Other:

Time-cost figures ("ten to thirty minutes per re-engagement", "real billable time across a fleet") are field observations consistent with the published interruption research above, not survey data of my own. The mid-market firm segmentation reflects my own experience consulting inside a boutique technology consulting firm and the adjacent shops it works alongside.