I’ve been hesitant to chime in on AI agent architectures other than to say they’re not there yet. The hard takeoff of OpenClaw (an always-on AI assistant with countless integrations out of the box) and Moltbook (a social network for people’s AI assistants to chat) created a compelling public spectacle that clearly indicates growing momentum behind LLM-powered abstractions.
Always wary of getting caught up in hype, I’d like to chime in again and try to tease out some common threads behind some of the most interesting and successful experiments. In doing so, I’ll suggest a potential way forward that I’d like to explore.
Coding Agents: Claude Code
This is the daily driver of many a vibe-coder. I mention it first only because it’s used as a building block in the next section. Most “coding agents” share the same basic structural features:
- System Prompt: The base “personality” of the LLM.
- Tools: Special types of messages that the LLM can send to invoke some out-of-context execution and retrieve results.
- Skills: Markdown files that can be brought into context, effectively extending the system prompt for some specific task. These are requested by the agent through tool invocation.
- Context: The sum total of all the above, plus conversation history and tool results.
Tools give the agent the ability to traverse, edit, and compile filesystem objects. That’s basically it - let it loose in your project folder, say a prayer, and you may get some useful work done. Your context will probably grow very rapidly, driving up costs. A wrong turn early in the context can lead to an expensive dead end that has to be backed out and re-attempted.
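To make that concrete, here’s a minimal sketch of the loop in Python. The tool names and the `call_llm` stub are illustrative - any chat-completion API slots in there - and none of this is Claude Code’s actual internals:

```python
# Minimal coding-agent loop: system prompt + tools + growing context.
# `call_llm` is a hypothetical stand-in for any chat-completion API.
import subprocess
from pathlib import Path

def read_file(path: str) -> str:
    return Path(path).read_text()

def write_file(path: str, content: str) -> str:
    Path(path).write_text(content)
    return f"wrote {len(content)} bytes to {path}"

def run_command(cmd: str) -> str:
    done = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return done.stdout + done.stderr

TOOLS = {"read_file": read_file, "write_file": write_file, "run_command": run_command}

def call_llm(context: list[dict]) -> dict:
    raise NotImplementedError("plug in your model provider here")

def agent_loop(system_prompt: str, task: str) -> str:
    # Context = system prompt + conversation history + tool results,
    # and it only ever grows - hence the cost problem.
    context = [{"role": "system", "content": system_prompt},
               {"role": "user", "content": task}]
    while True:
        reply = call_llm(context)
        context.append(reply)
        if "tool_call" not in reply:
            return reply["content"]          # final answer, no more tools
        # A tool call is just a special message: execute it out of
        # context, then feed the result back in as a new message.
        name, args = reply["tool_call"]["name"], reply["tool_call"]["args"]
        result = TOOLS[name](**args)
        context.append({"role": "tool", "content": result})
```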
My main issue with Claude Code is that the control plane is somewhat opaque:
- Can you reduce the system prompt?
- Can you characterize tool invocations and modify their implementations to use context more frugally?
- Can you understand and modify the permission scopes and easily sandbox the agent?
- Can you decide when to prune or compact the context to reduce costs?
The answer to all of these might be yes - especially if you adopt an open-source alternative and really dig in. My point isn’t that these are hard limitations, but that coding agents haven’t been compelling enough to make digging into their peculiarities feel worthwhile. Just YOLO with off-the-shelf settings and maybe you’ll get your money’s worth.
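For the last question in particular, the control I’d want looks something like this - a sketch only, with an invented 4-characters-per-token estimate and an arbitrary budget, not anything Claude Code actually exposes:

```python
# Sketch of user-controlled context compaction: once the context
# passes a token budget, truncate old middle turns while keeping the
# system prompt and the most recent exchanges intact.

def estimate_tokens(messages: list[dict]) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def compact(messages: list[dict], budget: int = 50_000) -> list[dict]:
    if estimate_tokens(messages) <= budget or len(messages) <= 7:
        return messages
    head, tail = messages[:1], messages[-6:]   # system prompt + recent turns
    middle = [{"role": m["role"], "content": m["content"][:200] + " …[truncated]"}
              for m in messages[1:-6]]
    return head + middle + tail
```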
Orchestration: Gas Town
This highly deranged and entertaining article on the Gas Town project describes experiments with assigning roles and lashing together instances of Claude Code. The author admits that Gas Town is “expensive as hell”. The math is simple: if Claude Code munches a ton of tokens, then dozens of Claude Codes munch dozens of tons of tokens.
The results are undeniably interesting. I struggle to get my head around the numerous roles and abstractions invented by the author, who seems to generate ideas faster than anyone else can keep up. The issue here is that the orchestration plane is highly opinionated and, as above, difficult to reorganize without really digging into the tool.
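Mechanically, though, the underlying pattern is simple: role-specialized system prompts, each driving its own full agent loop. This sketch reuses `agent_loop` from earlier and invents three roles - it’s the shape of the thing, not Gas Town’s actual design - and it makes the cost math visible: every role is another complete context to feed.

```python
# Role-based orchestration: each role is a full agent loop with its
# own system prompt and its own (expensive) context.
from concurrent.futures import ThreadPoolExecutor

ROLES = {
    "planner":  "Split the task into small, independent work items, one per line.",
    "builder":  "Implement one work item; report the files you changed.",
    "reviewer": "Critique the combined results; flag anything that must be redone.",
}

def orchestrate(task: str) -> str:
    plan_items = agent_loop(ROLES["planner"], task).splitlines()
    # Fan out: one builder agent per work item, running concurrently -
    # which is exactly where the token bill multiplies.
    with ThreadPoolExecutor() as pool:
        builds = list(pool.map(lambda item: agent_loop(ROLES["builder"], item),
                               plan_items))
    return agent_loop(ROLES["reviewer"], "\n".join(builds))
```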
Research from Google finds that multi-agent systems aren’t always the best tool for the job - so orchestrated agents like Gas Town might never be the default.
Pi
Pi is the “Minimal Agent” powering OpenClaw. It is fundamentally similar to any other coding agent, but it starts from a slim set of core tools (Read, Write, Edit, Bash) and encourages extension through modular components.
The core philosophy seems to be that a coding agent should know how to extend itself. This led to rapid bootstrapping of integrations, plus easy onboarding.
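The mechanism behind that bootstrapping is easy to sketch. This isn’t Pi’s actual API - the names are mine - but it captures the idea of a tool registry the agent can grow at runtime:

```python
# Self-extension in miniature: a define_tool tool that compiles new
# Python source and registers it, so the agent can grow its own toolset.
TOOLS: dict = {}

def define_tool(name: str, source: str) -> str:
    namespace: dict = {}
    exec(source, namespace)      # a real system would sandbox this!
    TOOLS[name] = namespace[name]
    return f"registered tool: {name}"

TOOLS["define_tool"] = define_tool

# The agent writes a tool, then immediately starts using it:
define_tool("greet", "def greet(who):\n    return f'hello, {who}'")
print(TOOLS["greet"]("world"))   # -> hello, world
```

It’s also a hint at why the attack surface balloons: every extension is arbitrary code.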
The tool has gone viral as an always-on AI assistant. While the idea is appealing, I can’t personally recommend it as the security attack surface seems way too large.
What Next?
My recent work in data modeling has centered on one idea: starting at the modeling level locks you into a particular architecture and tooling. An alternative is to work from a metamodel foundation with introspection - the ability to self-host - which buys you flexibility, provenance, and reproducibility.
My hypothesis at this point is that a true Minimal Agent shouldn’t work with filesystem and command-line primitives. Instead, the minimal tools for bootstrapping any kind of agent architecture might be:
- defining tools
- defining agent profiles
- creating workspaces (sandboxed VMs!)
- spawning agents into workspaces
This meta-agent’s prime concerns are security, scope, and orchestration. It has a self-definition. It could end up spawning new agents that look like Gas Town in one VM, or Pi in another.
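Here’s roughly what I have in mind, with all names and shapes invented for illustration - in particular, `Workspace` stands in for a real sandboxed VM:

```python
# The four bootstrap tools of the hypothetical meta-agent. A real
# create_workspace would provision a sandboxed VM; here a Workspace
# is just an isolated tool registry plus its resident agents.
from dataclasses import dataclass, field

@dataclass
class Profile:
    name: str
    system_prompt: str
    tool_names: list

@dataclass
class Workspace:
    name: str
    tools: dict = field(default_factory=dict)
    agents: list = field(default_factory=list)

REGISTRY: dict = {}      # every tool the meta-agent has defined
PROFILES: dict = {}
WORKSPACES: dict = {}

def define_tool(name, fn):
    REGISTRY[name] = fn

def define_profile(name, system_prompt, tool_names):
    PROFILES[name] = Profile(name, system_prompt, tool_names)

def create_workspace(name):
    WORKSPACES[name] = Workspace(name)

def spawn_agent(profile_name, workspace_name):
    profile, ws = PROFILES[profile_name], WORKSPACES[workspace_name]
    # Scope: an agent sees only the tools its profile names, copied
    # into its workspace - never the global registry.
    ws.tools.update({t: REGISTRY[t] for t in profile.tool_names})
    ws.agents.append(profile)

# A Pi-shaped loner in one VM; a Gas-Town-shaped swarm could go in another.
define_tool("bash", lambda cmd: f"(would run: {cmd})")
define_profile("pi-like", "You are a minimal coding agent.", ["bash"])
create_workspace("vm-1")
spawn_agent("pi-like", "vm-1")
```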
I just need a catchy project name and I’ll get a proof of concept online!