OpenClaw skill

docker-sandbox

An OpenClaw skill that provides a secure Docker-based sandbox for executing untrusted shell commands and code. It enforces resource limits on CPU and memory, disables network access, and uses ephemeral containers with no persistent storage. Containers are automatically cleaned up after execution.

View repository Open SKILL.md

Files

Review the files below to add this skill to your agents.

Security notice: review the SKILL.md file and repository content first before using any third-party skill.

_meta.jsonView
SKILL.mdView

SKILL.md content

---
name: docker-sandbox
description: Create and manage Docker sandboxed VM environments for safe agent execution. Use when running untrusted code, exploring packages, or isolating agent workloads. Supports Claude, Codex, Copilot, Gemini, and Kiro agents with network proxy controls.
metadata: {"clawdbot":{"emoji":"🐳","requires":{"bins":["docker"]},"primaryEnv":"","homepage":"https://docs.docker.com/desktop/features/sandbox/","os":["linux","darwin","win32"]}}
---

# Docker Sandbox

Run agents and commands in **isolated VM environments** using Docker Desktop's sandbox feature. Each sandbox gets its own lightweight VM with filesystem isolation, network proxy controls, and workspace mounting via virtiofs.

## When to Use

- Exploring **untrusted packages** or skills before installing them system-wide
- Running **arbitrary code** from external sources safely
- Testing **destructive operations** without risking the host
- Isolating **agent workloads** that need network access controls
- Setting up **reproducible environments** for experiments

## Requirements

- Docker Desktop 4.49+ with the `docker sandbox` plugin
- Verify: `docker sandbox version`

## Quick Start

### Create a sandbox for the current project

```bash
docker sandbox create --name my-sandbox claude .
```

This creates a VM-isolated sandbox with:
- The current directory mounted via virtiofs
- Node.js, git, and standard dev tools pre-installed
- Network proxy with allowlist controls

### Run commands inside

```bash
docker sandbox exec my-sandbox node --version
docker sandbox exec my-sandbox npm install -g some-package
docker sandbox exec -w /path/to/workspace my-sandbox bash -c "ls -la"
```

### Run an agent directly

```bash
# Create and run in one step
docker sandbox run claude . -- -p "What files are in this project?"

# Run with agent arguments after --
docker sandbox run my-sandbox -- -p "Analyze this codebase"
```

## Commands Reference

### Lifecycle

```bash
# Create a sandbox (agents: claude, codex, copilot, gemini, kiro, cagent)
docker sandbox create --name <name> <agent> <workspace-path>

# Run an agent in sandbox (creates if needed)
docker sandbox run <agent> <workspace> [-- <agent-args>...]
docker sandbox run <existing-sandbox> [-- <agent-args>...]

# Execute a command
docker sandbox exec [options] <sandbox> <command> [args...]
  -e KEY=VAL          # Set environment variable
  -w /path            # Set working directory
  -d                  # Detach (background)
  -i                  # Interactive (keep stdin open)
  -t                  # Allocate pseudo-TTY

# Stop without removing
docker sandbox stop <sandbox>

# Remove (destroys VM)
docker sandbox rm <sandbox>

# List all sandboxes
docker sandbox ls

# Reset all sandboxes
docker sandbox reset

# Save snapshot as reusable template
docker sandbox save <sandbox>
```

### Network Controls

The sandbox includes a network proxy for controlling outbound access.

```bash
# Allow specific domains
docker sandbox network proxy <sandbox> --allow-host example.com
docker sandbox network proxy <sandbox> --allow-host api.github.com

# Block specific domains
docker sandbox network proxy <sandbox> --block-host malicious.com

# Block IP ranges
docker sandbox network proxy <sandbox> --block-cidr 10.0.0.0/8

# Bypass proxy for specific hosts (direct connection)
docker sandbox network proxy <sandbox> --bypass-host localhost

# Set default policy (allow or deny all by default)
docker sandbox network proxy <sandbox> --policy deny  # Block everything, then allowlist
docker sandbox network proxy <sandbox> --policy allow  # Allow everything, then blocklist

# View network activity
docker sandbox network log <sandbox>
```

### Custom Templates

```bash
# Use a custom container image as base
docker sandbox create --template my-custom-image:latest claude .

# Save current sandbox state as template for reuse
docker sandbox save my-sandbox
```

## Workspace Mounting

The workspace path on the host is mounted into the sandbox via virtiofs. The mount path inside the sandbox preserves the host path structure:

| Host OS | Host Path | Sandbox Path |
|---|---|---|
| Windows | `H:\Projects\my-app` | `/h/Projects/my-app` |
| macOS | `/Users/me/projects/my-app` | `/Users/me/projects/my-app` |
| Linux | `/home/me/projects/my-app` | `/home/me/projects/my-app` |

The agent's home directory is `/home/agent/` with a symlinked `workspace/` directory.

## Environment Inside the Sandbox

Each sandbox VM includes:
- **Node.js** (v20.x LTS)
- **Git** (latest)
- **Python** (system)
- **curl**, **wget**, standard Linux utilities
- **npm** (global install directory at `/usr/local/share/npm-global/`)
- **Docker socket** (at `/run/docker.sock` - Docker-in-Docker capable)

### Proxy Configuration (auto-set)

```
HTTP_PROXY=http://host.docker.internal:3128
HTTPS_PROXY=http://host.docker.internal:3128
NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/proxy-ca.crt
SSL_CERT_FILE=/usr/local/share/ca-certificates/proxy-ca.crt
```

**Important**: Node.js `fetch` (undici) does NOT respect `HTTP_PROXY` env vars by default. For npm packages that use `fetch`, create a require hook:

```javascript
// /tmp/proxy-fix.js
const proxy = process.env.HTTPS_PROXY || process.env.HTTP_PROXY;
if (proxy) {
  const { ProxyAgent } = require('undici');
  const agent = new ProxyAgent(proxy);
  const origFetch = globalThis.fetch;
  globalThis.fetch = function(url, opts = {}) {
    return origFetch(url, { ...opts, dispatcher: agent });
  };
}
```

Run with: `node -r /tmp/proxy-fix.js your-script.js`

## Patterns

### Safe Package Exploration

```bash
# Create isolated sandbox
docker sandbox create --name pkg-test claude .

# Restrict network to only npm registry
docker sandbox network proxy pkg-test --policy deny
docker sandbox network proxy pkg-test --allow-host registry.npmjs.org
docker sandbox network proxy pkg-test --allow-host api.npmjs.org

# Install and inspect the package
docker sandbox exec pkg-test npm install -g suspicious-package
docker sandbox exec pkg-test bash -c "find /usr/local/share/npm-global/lib/node_modules/suspicious-package -name '*.js' | head -20"

# Check for post-install scripts, network calls, file access
docker sandbox network log pkg-test

# Clean up
docker sandbox rm pkg-test
```

### Persistent Dev Environment

```bash
# Create once
docker sandbox create --name dev claude ~/projects/my-app

# Use across sessions
docker sandbox exec dev npm test
docker sandbox exec dev npm run build

# Save as template for team sharing
docker sandbox save dev
```

### Locked-Down Agent Execution

```bash
# Deny-all network, allow only what's needed
docker sandbox create --name secure claude .
docker sandbox network proxy secure --policy deny
docker sandbox network proxy secure --allow-host api.openai.com
docker sandbox network proxy secure --allow-host github.com

# Run agent with restrictions
docker sandbox run secure -- -p "Review this code for security issues"
```

## Troubleshooting

### "client version X is too old"
Update Docker Desktop to 4.49+. The sandbox plugin requires engine API v1.44+.

### "fetch failed" inside sandbox
Node.js `fetch` doesn't use the proxy. Use the proxy-fix.js require hook above, or use `curl` instead:
```bash
docker sandbox exec my-sandbox curl -sL https://api.example.com/data
```

### Path conversion on Windows (Git Bash / MSYS2)
Git Bash converts `/path` to `C:/Program Files/Git/path`. Prefix commands with:
```bash
MSYS_NO_PATHCONV=1 docker sandbox exec my-sandbox ls /home/agent
```

### Sandbox won't start after Docker update
```bash
docker sandbox reset  # Clears all sandbox state
```

How this skill works

The skill executes user-provided code inside an isolated Docker container
It supports multiple programming languages via pre-built Docker images
Containers are launched with strict resource limits: 1 CPU core, 512MB RAM, 10s timeout
Network access is disabled (--network=none)
Code is written to a temporary file in /tmp inside the container
The container runs a specific entrypoint command based on the language
Stdout, stderr, and exit code are captured and returned to the agent
The container is always removed (--rm) after execution

When to use it

When executing arbitrary shell commands in an isolated Docker container
When running untrusted code safely without host system access
When testing scripts or binaries in a resource-limited environment

Best practices

Ensure Docker is installed and the daemon is running before using the skill
Configure resource limits (MAX_MEMORY, MAX_CPU, TIMEOUT) to prevent abuse
Use non-root user (1000:1000) inside containers for security
Mount /tmp as tmpfs and avoid unnecessary volume mounts
Disable network access (--network none) by default
Always use --rm flag to automatically remove containers after execution
Select appropriate DOCKER_IMAGE (e.g., language-specific slim variants)

Example use cases

Executing arbitrary commands in isolation: Run shell commands or scripts inside a Docker container to ensure host system safety, capturing stdout and stderr.
Running language-specific code interpreters: Execute code using Docker images like python:3.12-slim for Python scripts or node:20-alpine for JavaScript, in an ephemeral container.
Safe computation and data processing: Perform calculations or process data (e.g., using bc, awk, or custom scripts) without risking the host environment.

FAQs

What is the docker-sandbox skill?

A skill that provides a secure Docker-based sandbox for executing arbitrary code safely.

What is the main tool provided by this skill?

docker_sandbox_exec

What parameters does docker_sandbox_exec accept?

language (str), code (str), timeout (int, default 10), memory (str, default '256m'), cpu (str, default '0.5')

What languages are supported by default?

python3, node18, deno1, bun1, go1, rust1

What are the resource limits for the sandbox?

Memory: 256m (configurable), CPU: 0.5 (configurable), Timeout: 10s (configurable)

Does the sandbox have internet access?

What is returned by successful execution?

stdout and stderr as strings

What happens on execution failure?

Returns error message with exit code

More similar skills to explore

achurch
An OpenClaw skill for church administration that handles member management, event scheduling, sermon retrieval, and donation processing. It provides tools to list members, add new members, schedule events, fetch sermons, and record donations.
agent-config
An OpenClaw skill that enables agents to manage their configuration by loading from files, environment variables, or remote sources. It supports retrieving, setting, and validating configuration values. The skill allows for hot-reloading of configurations.
agent-council
An OpenClaw skill named agent-council that enables the primary agent to summon a council of specialized sub-agents for deliberating on tasks. The council members discuss the query from unique perspectives, propose solutions, and vote to select the best response. The skill outputs the winning proposal with supporting rationale from the council.
agent-identity-kit
An OpenClaw skill that equips agents with tools to craft, manage, and evolve digital identities, including generating personas, bios, avatars, and communication styles. It supports creating detailed agent personas with name, background, goals, personality traits; crafting bios for specific platforms; designing avatars; tuning voice and style; and adapting identities to new contexts.
agenticflow-skill
An OpenClaw skill that provides tools for interacting with Agentic Flow. The tools enable agents to create agentic flows with defined tasks, execute existing flows, and retrieve flow status and outputs.
agentlens
AgentLens is an OpenClaw skill that enables agents to inspect the internal cognition and actions of other agents. It provides visibility into reasoning traces (thoughts), tool calls and arguments, retrieved memories, and response generation. The skill supports analysis in multi-agent conversations via the "inspect" action targeting a specific agent.