I Killed OpenClaw and Built ClaudeClaw Mission Control
Retiring OpenClaw, migrating to ClaudeClaw Mission Control, and what five days of teardown taught me about operational blindness.
Two months ago I wrote about ripping Notion out of my workflow and replacing it with OpenClaw—a self-hosted AI agent framework running on my Mac Studio. No cloud. No subscription. No black box.
Last weekend I shut it down. Disabled 38 cron jobs. Moved 23 LaunchAgents into a _retired-openclaw/ quarantine folder. Killed the Ollama daemon. Archived the directory with a 30-day deletion timer.
Everything in that original article still reads as true. Local-first is still right. Data ownership is still right. The critique of SaaS “well-enough” software is still right. What I got wrong was believing OpenClaw was the right vehicle for any of it.
This is the post-mortem and the replacement: an agent OS I built on top of the Claude Agent SDK called ClaudeClaw Mission Control. Thirteen themed agents. One daemon. A scheduler I can actually see into. Zero silent failures slipping past me for a week before I notice.
Let me explain how I got here.
The Setup
OpenClaw was doing real work. 38 cron jobs. Morning briefings. Evening summaries. A content pipeline that pulled research from web sources, structured it, scored it, and queued articles for ASTGL. An email triage pass. A model-usage monitor. A nerve-health monitor watching the other monitors.
On paper: impressive. In practice: I had no idea if any of it was working.
The system was so noisy that when something broke, I learned about it four days later when I noticed my morning briefing hadn’t arrived. Or I didn’t learn about it at all, because the cron job was exiting 0 while the script inside it was crash-looping.
That last one is the killer. Let me show you what I mean.
What’s Actually Going On
Three failure modes hit me in a 48-hour window, and each one was invisible to the system watching the system.
Failure one: successful exits, 100% broken payload. My content pipeline was ingesting URLs, and a regression introduced a trailing-slash bug that made example.com/foo and example.com/foo/ look like different URLs to the dedup layer. Every new article hit a UNIQUE constraint violation inside a subprocess. The outer wrapper caught the error, logged it to a file nobody was reading, and exited 0. For two weeks the cron appeared green while 100% of structuring runs were crashing.
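The fix on the ClaudeClaw side was boring on purpose: canonicalize URLs before they ever reach the dedup layer. A minimal sketch of the idea (the helper name `canonicalizeUrl` is mine, not from either codebase):

```javascript
// Canonicalize a URL so trivial variants dedup to the same key.
// The WHATWG URL parser already lowercases the host and drops default
// ports; the only thing we add is trailing-slash normalization.
function canonicalizeUrl(raw) {
  const u = new URL(raw);
  // '/foo/' and '/foo' become the same path; keep a bare '/' as-is.
  if (u.pathname.length > 1 && u.pathname.endsWith('/')) {
    u.pathname = u.pathname.slice(0, -1);
  }
  return u.toString();
}

console.log(canonicalizeUrl('https://example.com/foo/') ===
            canonicalizeUrl('https://example.com/foo')); // true
```

Run every ingested URL through this before the UNIQUE key is computed and the constraint violation becomes a no-op dedup instead of a crash.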
Failure two: PATH-resolved Node. I had the daemon running Node 24 (absolute path, explicit). A subagent it spawned inherited a PATH that fell through to Homebrew’s Node 25. One of the native modules (better-sqlite3) was compiled against 24, so every subagent invocation crashed with ERR_DLOPEN_FAILED and MODULE_VERSION mismatch. The smoke test I’d written passed because it ran from the daemon’s shell. The actual production path failed every time.
Failure three: auth expiry with no escape hatch. OpenClaw stored some credentials in pass (the Unix password store). When my GPG key timed out, the daemon couldn’t start. Which meant the health monitor couldn’t start. Which meant the thing that would have told me about the outage was the thing that was out. OpenClaw had no watcher that lived outside the daemon it was watching.
None of these are OpenClaw-specific bugs in the upstream sense. They’re pattern problems that emerge anywhere you have:

1. A monolithic daemon responsible for its own monitoring.
2. Flat-file state (HEARTBEAT.md, LEARNINGS.md) that gets appended to rather than queried.
3. Exit codes treated as truth when the real signal is in stderr.
4. No separation between “Did it run?” and “Did it work?”
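Point 4 is worth making concrete. “Did it run?” is the exit code; “Did it work?” needs a second opinion from the output. A minimal classifier sketch (the crash signatures and function name are my own illustration, not from either codebase):

```javascript
// Signatures that mean "the process exited 0 but the work failed".
const CRASH_SIGNATURES = [
  'ERR_DLOPEN_FAILED',
  'MODULE_VERSION',
  'Traceback (most recent call last)',
  'UNIQUE constraint failed',
];

// Classify a finished job from both signals, not just the exit code.
function classifyRun({ exitCode, output }) {
  if (exitCode !== 0) return 'failed';
  if (CRASH_SIGNATURES.some((sig) => output.includes(sig))) {
    return 'hidden_failure'; // ran, but didn't work
  }
  return 'ok';
}

console.log(classifyRun({ exitCode: 0, output: 'UNIQUE constraint failed: articles.url' }));
// → 'hidden_failure'
```

Exit code 0 plus a crash signature is the exact shape of my trailing-slash incident: green on the outside, crash-looping on the inside.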
OpenClaw was built for a different job. It was a personal automation gateway—great at “kick off this script at 6:30 AM.” It wasn’t built to be an agent OS with observability. I was using a shovel to drive screws.
I also couldn’t ignore the security posture. February’s disclosures—135,000 exposed instances, 15,000 vulnerable to RCE, the ClawHavoc plugin-registry incident, nine CVEs—had pushed me to patch hard and lock down. But every week I spent hardening OpenClaw was a week I wasn’t building what I actually wanted: themed agents that owned workstreams, could be reasoned about individually, and fail loudly.
The Fix
ClaudeClaw Mission Control is a Node.js daemon built on the Claude Agent SDK. It runs as a single LaunchAgent (com.claudeclaw.app), owns a SQLite store at store/claudeclaw.db, polls a scheduled_tasks table every 60 seconds, and dispatches due tasks to agents by ID.
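The poll loop itself is small. A sketch of the due-task predicate under my assumptions about the schema (next_run as a unix-ms timestamp, a status column as described), not the actual ClaudeClaw source:

```javascript
// One scheduler tick: find active tasks whose next_run has passed.
// In the real daemon this is a SQL query against scheduled_tasks;
// here it's the same predicate over plain objects.
function dueTasks(tasks, now = Date.now()) {
  return tasks.filter((t) => t.status === 'active' && t.next_run <= now);
}

const tasks = [
  { id: 1, agentId: 'steward',  status: 'active', next_run: 1000 },
  { id: 2, agentId: 'maester',  status: 'paused', next_run: 1000 },
  { id: 3, agentId: 'watchman', status: 'active', next_run: 9999999 },
];

console.log(dueTasks(tasks, 2000).map((t) => t.id)); // [ 1 ]
```

Everything due on a tick gets dispatched to its agent by ID; everything else waits for the next 60-second poll.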
The interesting part isn’t the daemon. It’s the agents.
I set up thirteen of them, themed after the small council of a certain fictional kingdom, because if I’m going to stare at this UI every day, I’d rather it amused me.
Thirteen themed agents, each owning a workstream. STEWARD drives my mornings and evenings. MAESTER runs the ASTGL content pipeline. WATCHMAN watches the whole system from outside it.
Each agent lives in its own directory at agents/<id>/, with an agent.yaml (model, personality, cwd, MCP servers) and a CLAUDE.md system prompt. A scheduled task carries an agentId column in the DB, and the dispatcher routes like this:
```javascript
if (shouldRouteViaAgent(task.agentId, listAgentIds())) {
  const result = await delegateToAgent(task.agentId, task.prompt, {
    fromAgent: SCHEDULER_FROM_AGENT,
    chatId: task.chatId,
  });
  return result.text ?? '(empty response)';
}
```
Adding a new agent is now: drop a folder under agents/, write a CLAUDE.md, run schedule reassign <task-id> <agent-id>. No source changes. The dispatcher picks it up on next tick.
That’s the piece I kept trying and failing to get with OpenClaw—modular ownership. In OpenClaw, everything was “the daemon.” In ClaudeClaw, MAESTER owning the content pipeline means if content alerts stop firing, the log line says maester: task failed instead of openclaw-gateway: subprocess exited nonzero. Attribution is free.
The Watchman probes
WATCHMAN runs every hour at :05. It has seven probes, each targeting a failure mode that burned me on OpenClaw:
1. Failed tasks. status='failed' in the DB. Trivial.
2. Stuck tasks. status='running' AND last_run < now - 10min. This catches hangs.
3. Missed slots. status='active' AND next_run < now - 60s. Catches scheduler drift.
4. Daemon liveness. launchctl print gui/$UID/com.claudeclaw.app—does launchd still have it?
5. Content-pipeline health. Tails the structured log file, parses the JSON, checks for crash shapes.
6. Hidden failures. Scans the last_result text column for ERR_DLOPEN_FAILED, MODULE_VERSION, Traceback, and other “the job exited zero but it sure didn’t work” signals. This is the probe that would have caught my trailing-slash bug in an hour instead of two weeks.
7. Delegation crashes. inter_agent_tasks WHERE status='failed'—on-demand agent invocations that blew up.
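Probes 2 and 3 are just time arithmetic over the same table. A sketch of the two predicates (the thresholds match the list above; the function names are mine):

```javascript
const TEN_MIN = 10 * 60 * 1000;

// Probe 2: marked running, but started more than 10 minutes ago.
function isStuck(task, now = Date.now()) {
  return task.status === 'running' && task.last_run < now - TEN_MIN;
}

// Probe 3: active, but its scheduled slot came and went over 60s ago.
function missedSlot(task, now = Date.now()) {
  return task.status === 'active' && task.next_run < now - 60 * 1000;
}

const now = Date.now();
console.log(isStuck({ status: 'running', last_run: now - 11 * 60 * 1000 }, now)); // true
console.log(missedSlot({ status: 'active', next_run: now - 30 * 1000 }, now));    // false
```

In the daemon these are WHERE clauses; the point is that both probes answer “did it work?” questions the exit code can’t.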
On top of that, there’s a separate LaunchAgent running a healthcheck every 30 minutes that lives outside the main daemon and uses a keychain-backed Telegram token. If the daemon is dead, the healthcheck still delivers the alert. That’s the lesson from failure three: the watcher cannot share fate with the watched.
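The external healthcheck reduces to one decision: if launchd can’t produce the service, or the service isn’t in a running state, fire the alert through the side channel. A sketch of that decision, assuming launchctl print exits nonzero for an unloaded service and includes a state = running line for a healthy one (how you shell out and post to Telegram is up to you):

```javascript
// Decide whether the out-of-band healthcheck should alert, given the
// result of `launchctl print gui/$UID/com.claudeclaw.app`.
function daemonNeedsAlert({ exitCode, stdout }) {
  if (exitCode !== 0) return true;            // launchd doesn't know the service
  return !stdout.includes('state = running'); // loaded, but not actually running
}

console.log(daemonNeedsAlert({ exitCode: 113, stdout: '' }));                // true
console.log(daemonNeedsAlert({ exitCode: 0, stdout: 'state = running\n' })); // false
```

The crucial property is where this runs: its own LaunchAgent, its own keychain-backed token, zero shared fate with the daemon it judges.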
Memory v2
OpenClaw’s memory was HEARTBEAT.md and LEARNINGS.md—flat files I appended to. Eventually they got long enough that the agent stopped reading them usefully, and I had no query surface to pull just the relevant bits.
ClaudeClaw’s Memory v2 is a five-layer context stack:

1. Semantic recall—cosine similarity against stored memory embeddings, top 5 by score, chat-scoped.
2. Recent high-importance memories—memories with importance >= 0.7 written in the last 7 days.
3. Consolidation insights—a 30-minute loop that summarizes the short-term buffer into durable notes.
4. Cross-agent hive—stubbed for now; eventually lets MAESTER peek at something STEWARD noted this morning.
5. Conversation history—last N turns.
Layers dedupe by memory ID. The whole thing is safe to drop into the SDK’s systemPrompt option. It’s not magic. It’s just queryable instead of append-only, which is the delta between “context I can use” and “a log file I’ll never re-read.”
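Stacking the layers is a fold with a seen-set. A sketch of the dedupe-by-ID merge (layer order encodes priority; the names are illustrative, not the ClaudeClaw source):

```javascript
// Merge memory layers in priority order, keeping the first occurrence
// of each memory ID. Earlier layers win; later duplicates are dropped.
function mergeLayers(...layers) {
  const seen = new Set();
  const out = [];
  for (const layer of layers) {
    for (const mem of layer) {
      if (seen.has(mem.id)) continue;
      seen.add(mem.id);
      out.push(mem);
    }
  }
  return out;
}

const semantic = [{ id: 'm1', text: 'deploy notes' }, { id: 'm2', text: 'dns fix' }];
const recent   = [{ id: 'm2', text: 'dns fix' }, { id: 'm3', text: 'gpg expiry' }];

console.log(mergeLayers(semantic, recent).map((m) => m.id)); // [ 'm1', 'm2', 'm3' ]
```

Flatten the merged list into text and it drops straight into a system prompt; the dedupe keeps a memory that scores high on both semantic recall and recency from being injected twice.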
Forum-topic routing instead of bot-per-agent
A small but satisfying piece. All thirteen agents post to one Telegram bot, into one supergroup, but each agent has a dedicated forum topic:
Alerts → thread 22 (WATCHMAN)
ASTGL → thread 23 (MAESTER)
Council → thread 24
Steward → thread 25
Whisperers → thread 26
War Room - Security → thread 40 (WAR)
One token. One chat. Threaded conversations per domain. The ergonomics are dramatically better than 13 separate bots with 13 separate tokens, which is the architecture I almost built before I remembered that Telegram supergroups have forum topics now.
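Routing is one extra field on the Bot API call: sendMessage accepts a message_thread_id parameter that targets a forum topic inside a supergroup. A sketch of the payload builder (the thread map mirrors the list above; the chat ID is a placeholder):

```javascript
// Agent → forum-topic thread ID. One supergroup, one bot token.
const THREADS = {
  watchman:   22, // Alerts
  maester:    23, // ASTGL
  council:    24,
  steward:    25,
  whisperers: 26,
  war:        40, // War Room - Security
};

// Build the sendMessage body; message_thread_id drops the message into
// that agent's forum topic instead of the supergroup's general thread.
function telegramPayload(agentId, text, chatId = -1001234567890) {
  return {
    chat_id: chatId,
    message_thread_id: THREADS[agentId],
    text,
  };
}

console.log(telegramPayload('watchman', 'daemon down').message_thread_id); // 22
```

Same token, same chat, and every agent still gets an addressable channel.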
Why This Matters
A few things I want to flag for anyone planning something similar.
Build the rollback before you build the new thing. I wrote scripts/retire-openclaw.sh with explicit --rollback semantics before I disabled a single cron job. Plists get moved (not deleted) into _retired-openclaw/. Cron jobs get flipped enabled: false with a timestamped backup (jobs.json.bak.pre-retire-20260419). The OpenClaw directory sits untouched for 30 days with a calendar reminder to delete it. If ClaudeClaw had cratered on day two, I was one shell command away from being back on the old system in under a minute.
Silent success is worse than loud failure. The design principle I pulled from this whole experience: every job in the system needs someone whose job it is to doubt that job ran correctly. That’s WATCHMAN. That’s the external healthcheck. That’s probe #6 specifically scanning success logs for crash text. If your system can tell you “everything’s green” without that green being adversarially checked, the green doesn’t mean anything.
Themed agents beat generic workers. This one I didn’t expect. Giving each workstream a named agent with its own CLAUDE.md persona made the system more debuggable, not less—because now when STEWARD’s morning briefing has weird tone issues, I know exactly which file to edit, and I’m not risking regressions in seven other jobs that would have shared a single “universal assistant” prompt. The theme is cosmetic. The isolation is load-bearing.
The Claude Agent SDK is the right abstraction for this. I spent a while trying to decide whether to keep hacking on OpenClaw, fork it, or start over. Starting over was the right call specifically because the Agent SDK handles the parts I was getting wrong: sub-agent dispatch, MCP tool wiring, system-prompt composition, retry on transient errors. I wrote the parts that are mine (the scheduler, the memory stack, the Telegram layer, the agent router) and let the SDK own the parts that are undifferentiated heavy lifting.
What I gave up. Ollama. Local models. Full offline operation. ClaudeClaw talks to Anthropic’s API, and that’s a real philosophical loss versus the local-first thing I was doing with OpenClaw. I thought about this a lot. The honest answer is that Claude Opus is enough better at long-context agentic work than anything I could run locally that the tradeoff pays for itself. I still own my data—every memory, every document, every log is on my SSD. I just don’t own the weights. For this phase, that’s the right trade.
What I kept. The philosophy. Every document is a file I can grep. Every config is version-controlled. Every decision has a session note I can link to in a future article. The system is mine to read, mine to modify, mine to understand. The whole reason I left Notion is still the whole reason I left Notion.
Quick Reference
The migration, by the numbers:

- 5 days—start of retirement to all 13 agents live (2026-04-19 → 2026-04-21)
- 30+ PRs—one atomic change per commit, conventional-commit format
- 38 cron jobs disabled, 23 LaunchAgents quarantined
- 13 agents onboarded, 7 Watchman probes live, 14 scheduled tasks dispatched via agentId
- 30-day rollback window still open
The retired vs. the replacement:
Seven dimensions where the new system pays for itself—from runtime surface to routing to the memory model.
The rule I wrote for myself: No job ships without an external watcher that shares no fate with it. That’s the whole story. Two months of OpenClaw and 48 hours of cascading invisible failures reduced to one sentence I’ll never forget.
I’ll keep writing the ClaudeClaw build-out week by week—the Council orchestration pattern, the Curator autonomous publishing workflow, the voice-mode bridge, the stuff that’s too long for one article. If you want the view from inside while it’s happening, that’s what this is.
Found this useful? I share practical lessons from my systems engineering journey at As The Geek Learns.