Sandbox
A sandbox is an execution boundary that limits what an AI coding agent can access — which directories it can read or write, which processes it can spawn, which network hosts it can reach. For agentic coding CLIs like Codex CLI and Claude Code, sandboxing is how you give the agent shell access without giving it the keys to your whole machine.
Why it matters
Autonomous agents run shell commands. A confused agent with no constraints can rm -rf the wrong directory, exfiltrate credentials, or push to the wrong remote. Sandboxes turn "I trust this agent completely" into "the agent can only do what its sandbox permits" — which is the right posture for production use.
Even for solo devs, sandboxing prevents the class of mistakes where the agent hallucinates a command that happens to be destructive. SpaceSpider itself doesn't impose a sandbox — it runs whatever CLI you pick in a PTY — but most of the CLIs it hosts ship their own sandbox mechanisms.
How it works
Sandbox implementations depend on the OS:
- macOS: Seatbelt (sandbox-exec) profiles deny syscalls and filesystem regions. Codex CLI uses this for its sandboxed execution mode.
- Linux: Landlock + seccomp. Restricts directory access and kernel calls without requiring root.
- Windows: AppContainer or Job Objects, though cross-platform agent CLIs often skip fine-grained Windows sandboxing.
- Containers: Docker or Podman as a coarser sandbox for the whole agent process.
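As a rough illustration of the container option, the sketch below builds a `docker run` command line that mounts only the project directory and disables networking. The function name `docker_sandbox_argv`, the `/workspace` mount point, and the image passed in are assumptions made for this example; none of them come from SpaceSpider or from any particular agent CLI.

```python
from pathlib import Path


def docker_sandbox_argv(project_dir, image, command, allow_network=False):
    """Build a `docker run` argv that confines a command to one project directory.

    Hypothetical helper for illustration; it only constructs the argument list.
    """
    project = Path(project_dir).resolve()
    argv = [
        "docker", "run", "--rm",
        "--read-only",                   # container root filesystem is read-only
        "--tmpfs", "/tmp",               # scratch space that vanishes with the container
        "-v", f"{project}:/workspace",   # the project directory is the only host path mounted
        "-w", "/workspace",              # run the command from inside the project
    ]
    if not allow_network:
        argv += ["--network", "none"]    # no network interfaces besides loopback
    return argv + [image, *command]
```

Passing the result to `subprocess.run(argv, check=True)` would run the command inside the container, assuming Docker is installed and the image is available locally or pullable.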
Common restrictions:
- Writes limited to the project directory (no home, no system)
- Network disabled or restricted to an allow-list
- Process spawning limited to specific binaries
- Read access limited to project + standard libs
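The restrictions above can be modeled as a small policy object. This is a minimal sketch of the checks only, not how any real sandbox enforces them: Seatbelt, Landlock, and seccomp enforce rules in the kernel, not in application code. The class `SandboxPolicy`, its field names, and the example hosts and binaries are all hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical policy model; real sandboxes enforce this at the OS level."""
    project_dir: Path
    allowed_hosts: frozenset = field(default_factory=frozenset)
    allowed_binaries: frozenset = field(default_factory=frozenset)

    def can_write(self, target):
        """Allow writes only inside the project directory."""
        root = self.project_dir.resolve()
        # Relative targets are taken relative to the project; resolve() collapses
        # ".." segments and symlinks so they cannot be used to escape the root.
        resolved = (root / target).resolve()
        return resolved == root or root in resolved.parents

    def can_connect(self, host):
        """Allow network access only to allow-listed hosts."""
        return host in self.allowed_hosts

    def can_spawn(self, binary):
        """Allow spawning only binaries named exactly as in the allow-list."""
        return binary in self.allowed_binaries


policy = SandboxPolicy(
    project_dir=Path("/tmp/proj"),                 # example path
    allowed_hosts=frozenset({"pypi.org"}),         # example registry host
    allowed_binaries=frozenset({"git", "python"}),
)
```

With this policy, `policy.can_write("src/main.py")` is allowed while `policy.can_write("../other/file")` and `policy.can_write("/etc/passwd")` are refused, because both resolve to paths outside the project directory.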
Codex CLI exposes a full-auto mode that assumes its sandbox is active, letting the agent run commands without per-step approval.
How it's used
Typical sandbox configurations:
- Dev: project-directory-only writes, network allow-listed to package registries
- CI: strict sandbox plus no network, to keep builds reproducible
- Exploratory: loose sandbox, trust the agent, rely on checkpoints for recovery
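One way to picture these three configurations is as named presets. The sketch below is illustrative only: the `Preset` class, the field values such as `"allow-list"` and `"any"`, and the registry hosts are assumptions, and reading "loose" as unrestricted writes plus open network is an interpretation rather than something the CLIs define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    """Hypothetical sandbox preset; field values are illustrative labels."""
    write_scope: str            # "project" or "any"
    network: str                # "none", "allow-list", or "open"
    allowed_hosts: tuple = ()

    def host_allowed(self, host):
        if self.network == "open":
            return True
        if self.network == "allow-list":
            return host in self.allowed_hosts
        return False            # "none" and unknown values fail closed


PRESETS = {
    # Dev: project-only writes, network limited to package registries (example hosts).
    "dev": Preset("project", "allow-list", ("pypi.org", "registry.npmjs.org")),
    # CI: strict sandbox with no network at all.
    "ci": Preset("project", "none"),
    # Exploratory: loose sandbox; recovery relies on checkpoints instead.
    "exploratory": Preset("any", "open"),
}
```

Treating unknown network modes as "deny" keeps a misconfigured preset from silently opening network access.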
Related terms
- Plan mode — read-only mode, a related containment tool
- Checkpoint — recovery vs. prevention
- Hook — can add custom guardrails on top of a sandbox
- Codex CLI — canonical sandbox-first CLI
- Autonomous agent — the thing being contained
FAQ
Does sandboxing slow the agent down?
Syscall filtering adds negligible overhead, typically on the order of microseconds. The larger cost comes when the sandbox blocks a command the agent needed and the agent has to adapt, but that's the point.
Is SpaceSpider sandboxed?
SpaceSpider itself runs in Tauri's process model with capability-based permissions (see capabilities/default.json). The CLIs it hosts run with whatever sandbox they configure internally. We don't add an extra layer on top of each CLI.