Self-Editing Coding Agents

Context surgery as a tool.

Thomas Ptacek on 2025-09-26T16:25:41:

You can give an agent access to its own context and ask it to lobotomize itself like Eternal Sunshine. I just did that with a log ingestion agent (broad search to get the lay of the land, which eats a huge chunk of the context window, then narrow searches for weird stuff it spots, then go back and zap the big log search). I assume this is a normal approach, since someone else suggested it to me.

Thomas Ptacek on 2025-09-26T16:42:38:

This is why I'm writing my own agent code instead of using simonw's excellent tools or just using Claude; the most interesting decisions are in the structure of the LLM loop itself, not in how many random tools I can plug into it. It's an unbelievably small amount of code to get to the point of super-useful results; maybe like 1500 lines, including a TUI.

Given the wide acknowledgment that the current effective usage of coding agents involves effective management of context, I expect to see context editing functionality show up in the major editors in the near future. Cursor provides auto-generated summaries of previous chats as context that can be fed into a different agent (useful for trying to stay on task but resetting the context). Claude Code provides sub-agents, which let the current coding agent spawn a sub-agent for specific tasks, managing the top-level agent's context by keeping task-focused context out of the upper level.

Ptacek's described approach here is interesting and novel to me — both in making it a tool handed to the agent, as well as in simply developing your own agent runtime. If I were developing the feature, I'd want to see the performance of different llm models on summarization/"importance marking" tasks, because I have a hypothesis that you could get better, cheaper results, by making the context editing an out-of-band command which uses a smaller, cheaper llm model to edit the context, rather than bringing the full power of whichever model is your workhorse.

Updated 2025-09-28 at 17:0: I remembered that Cursor recently released auto-summarization to reset the context, and Claude Code has had auto-summarization (they call it auto-compaction) since mid-year. I still think there's plenty of room for innovation in this area — running compaction more intelligently and more often. For example: by targeting all build output for removal, or targeting "non-successful actions".