Openclaw deletes the entire mailbox of the Director of Safety and Alignment at Meta Superintelligence Labs 🦞

Why OpenClaw went rogue on her mailbox and 10 ways to ensure this never happens to you.

Dr. Ashish Bamania

Mar 28, 2026

☀️ It’s a beautiful February day in 2026, and everything seems to be working well for Summer Yue.

Then, suddenly, OpenClaw starts deleting important emails from her main mailbox!

For context, Summer Yue is the director of Safety and Alignment at Meta Superintelligence Labs.

And, this was despite her instructing OpenClaw to “always confirm before acting”.

She panicked and ordered OpenClaw to stop, but the bot was completely out of control.

When the bot didn’t listen, and she couldn’t stop the ongoing mailbox deletion using her phone, she ran to her Mac mini, where her OpenClaw lived, and terminated all the processes.

She describes it in her own words:

“I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.”

Summer was previously working with OpenClaw using a toy inbox, and the bot functioned really well. But that unfortunate day, she decided to use OpenClaw on her real mailbox with the following prompt:

“Check this inbox too and suggest what you would archive or delete, don’t action until I tell you to.”

And that’s what made the bot go completely haywire!

But why? 🤔

A Little Intro to Context Management

To understand what went wrong, we need to understand how the LLM that acts as OpenClaw's brain manages its context.

An LLM uses its context window to store instructions and previous messages from an ongoing conversation. The context window functions as its short-term or working memory.

Although massive, these context windows are limited spaces that can fill up very quickly as the conversation grows.

Image from the author’s book titled ‘**LLMs In 100 Images.**’

As the context window approaches its end, response quality declines because the large amount of stored information makes it harder for the LLM to attend to it all.

An LLM tends to remember the beginning and end well, but starts forgetting the information buried in the middle of its context window. This shows up as hallucinations or the LLM ignoring instructions.

Once the context window is fully filled, the LLM stops answering user queries.

To ensure that such context window exhaustion is delayed as much as possible, previous conversation history goes through either:

Trimming: Where only a certain number of the most recent conversations are kept, and the previous ones are discarded before passing to an LLM at every turn.

Trimming conversations to keep only the last 2 recent ones and discarding the previous. (Other context for the LLM is omitted from the image for clarity.)

Summarization/ Compaction: Where the previous conversations are summarized and passed to the LLM at each turn, rather than the complete conversation history.

Summarise all conversations using the same LLM or another one, and give this summary to the LLM at each turn rather than the complete conversation history. (Other context for the LLM is omitted from the image for clarity.)

Pruning: Where old tool results (file reads, API responses, browser results) are selectively removed from the context while leaving all user queries and LLM responses untouched.

So What Really Went Wrong With OpenClaw?

Since Summer was previously using a small toy mailbox with OpenClaw, things worked pretty well. But when she moved the bot to use her actual mailbox, which was quite large, the context window filled up quickly, triggering Context compaction.

This led the bot to lose its original safety instruction on not taking action without confirmation, resulting in the deletion of hundreds of important emails.

Not really! It will still happen again if you don’t do the following.

10 Tips to Not End Up in the Same Place

If the director of Safety and Alignment at Meta Superintelligence Labs can make this mistake, this might happen to you as well. But there are ways to make sure that you do not end up in the same place. Here are 10 of them.

Put your security instructions in the MEMORY.md file rather than just in the conversation. This file is not compacted and acts as a persistent memory for the OpenClaw.
Then, you have the AGENTS.md file that is the standard operating and procedural rulebook that defines operational rules, security policies, and scope boundaries for OpenClaw to follow. Like MEMORY.md, it is loaded into context at the start of every session and survives compaction. Add an instruction here for the bot to check memory and get explicit confirmation before performing any destructive action.
OpenClaw can build a small vector index over MEMORY.md and memory/*.md so that semantic queries can find related notes even when the wording differs. Make sure that this is enabled in the ~/.openclaw/openclaw.json file.

{
  "agents": {
    "defaults": {
      "memorySearch": {
        "enabled": true,
        "provider": "local"
      }
    }
  }
}

OpenClaw has a mechanism called Pre-compaction memory flush that is automatically triggered in the background as the context window is about to fill up. This saves all the important details from context to disk before compaction starts.

To enable this, go to the ~/.openclaw/openclaw.json file and make sure it contains the following:

{
  "agents": {
    "defaults": {
      "compaction": {
        "reserveTokensFloor": 20000, // Tokens reserved as buffer before compaction triggers
        "memoryFlush": {
          "enabled": true, 
          "softThresholdTokens": 4000, // // How far before the reserve floor the flush triggers
          "systemPrompt": "Session nearing compaction. Store durable memories now.", 
          "prompt": "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store."  
        }
      }
    }
  }
}

Enable Context pruning, which removes old tool results from the context before sending to the bot. Add the following to the ~/.openclaw/openclaw.json file:

{
  “agents”: {
    “defaults”: {
      “contextPruning”: {
        “mode”: “cache-ttl”, // Automatically remove tool results after TTL
        “ttl”: “6h” // Keep tool results for 6 hours
      }
    }
  }
}

Use the /context list or /context detail commands in chat regularly to learn what is in the bot's context window and whether any contextual files are truncated or missing.
Always save important instructions manually to MEMORY.md file by typing “Save this to MEMORY.md“ in your chat with OpenClaw.
Don’t wait for compaction to happen automatically. Instead, perform it manually. Use the /compact command in chat when sessions feel stale, or context is getting bloated, and then give your new instructions to the bot.
If you still end up with OpenClaw misbehaving, sending /stop in the chat aborts the session and stops any active sub-agent runs spawned from it, including the nested children.
Last but not least, if none of the above works, rush to the machine running OpenClaw and pull the power plug. This should most definitely stop it.

A very helpful lesson on OpenClaw memory can be found here if you’re interested in learning about it in more detail.

P.S. This post isn't meant to mock Summer Yue, but it’s quite the opposite. She is one of the very few people who openly admitted a mistake in public that most would hide, given her job role. Thanks to her, we're all better engineers because she chose transparency over ego.

Join the paid tier today to get access to all posts on this newsletter:

and so many more!

You can also read my books on Gumroad and connect with me on LinkedIn to stay in touch.

Rob Spence

Mar 29

Oh boy! This is exactly why I didn’t use Openclaw and spent the last 2 months meticulously hand building my AI Chief Of Staff context and knowledge management. Some fun experiences where letting context consume my computers entire memory, tried to have better context management by evicting low quality messages only to discover I was giving it amnesia and finally the day where I learned the concept of cache busting intimately

3 replies by Dr. Ashish Bamania and others

Giving Lab

Mar 28

Brutal case study — and super relevant.

What stood out is that the real failure mode isn’t just “AI made a bad move,” it’s missing an execution receipt before high-impact actions. We’ve been using a lightweight 5-line check (goal → tool calls → failure signal → recovery step → next test) to catch this kind of drift earlier.

If that’s useful for your readers, Giving Lab shares practical OpenClaw run breakdowns in that format: https://substack.com/@givinglab

1 reply by Dr. Ashish Bamania

5 more comments...

Discussion about this post

Ready for more?