Sandcastle: Run Parallel AI Coding Agents in Docker Worktrees
Running an AI coding agent unattended means trusting it not to corrupt your working directory. Most developer setups skip isolation entirely: the agent writes wherever the model tells it to. Sandcastle solves the problem at the file system level. Every agent gets its own git worktree mounted inside a Docker container, and the only thing that touches your host is a merged commit when the agent finishes.
The library is built by Matt Pocock (creator of ts-error-translator and total-typescript), released under MIT, and published on npm as @ai-hero/sandcastle. This article covers the inspected API surface and what it means for teams running parallel AI coding workflows.
The Problem It Solves
When you run a coding agent in your working tree, it competes with your in-progress edits. Two agents on the same repo at the same time produce merge conflicts, overwritten files, and failed tests that are hard to attribute. The standard workaround — clone a fresh copy per agent — requires disk space, network, and a new npm/pip install for every run.
Sandcastle's approach: use git worktrees as the isolation primitive. A worktree is a linked checkout that shares the object store with your main repo. Each agent gets a worktree on its own branch, inside a Docker bind-mount. The container sees the files; the host sees only the final committed diff.
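To make the isolation primitive concrete, here is a minimal sketch of the two commands such a setup relies on: `git worktree add` to create a per-agent checkout on its own branch, and `docker run -v` to bind-mount that checkout into a container. This is not Sandcastle's implementation. The helper names, the `agent/<id>` branch naming, and the `/workspace` mount point are assumptions made for illustration.

```typescript
// Illustrative only: these helpers are hypothetical and are not part of Sandcastle.
// They build argument lists for the real `git` and `docker` CLIs.

/** Arguments for `git worktree add -b <branch> <dir> <base>`: a linked checkout on a new branch. */
function worktreeAddArgs(agentId: string, worktreeRoot: string, baseRef = "HEAD"): string[] {
  const branch = `agent/${agentId}`; // assumed naming scheme
  const dir = `${worktreeRoot}/${agentId}`;
  return ["worktree", "add", "-b", branch, dir, baseRef];
}

/** Arguments for `docker run` that bind-mount one worktree as the container's working directory. */
function dockerRunArgs(worktreeDir: string, image: string, command: string[]): string[] {
  const mountPoint = "/workspace"; // assumed mount point
  return ["run", "--rm", "-v", `${worktreeDir}:${mountPoint}`, "-w", mountPoint, image, ...command];
}

// Two agents get two checkouts; both share the main repository's object store.
for (const id of ["a1", "a2"]) {
  console.log(["git", ...worktreeAddArgs(id, "/tmp/worktrees")].join(" "));
}
```

Because each worktree lives on its own branch, two agents can edit the same file without touching each other's checkout. Conflicts only surface later, at merge time.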
What Was Verified
Effloow Lab inspected @ai-hero/sandcastle via npm info --json on 2026-06-04:
- Version: 0.7.0
- License: MIT
- Published: May 30, 2026
- Unpacked size: 14.6 MB
- Direct dependency: @clack/prompts ^1.1.0 (interactive CLI prompts; 1 dep total)
- Peer dependencies: @daytona/sdk ^0.164.0, @vercel/sandbox >=1.0.0
- Maintainer: mpocock (published via GitHub Actions OIDC)
The package exports map lists the main entry point and five sandbox providers:
. → sandcastle.run() main API
./sandboxes/docker → Docker provider
./sandboxes/vercel → Vercel sandbox provider
./sandboxes/podman → Podman provider
./sandboxes/daytona → Daytona cloud provider
./sandboxes/no-sandbox → run agent directly on host (no isolation)
The Core API: sandcastle.run()
The library exports three primary functions:
```typescript
import { run, createSandbox, interactive } from "@ai-hero/sandcastle";
import { DockerSandbox } from "@ai-hero/sandcastle/sandboxes/docker";

await run({
  prompt: "Add a Jest test for the parseDate function in src/utils.ts",
  provider: new DockerSandbox({
    image: "node:22-alpine",
    branchStrategy: "merge-to-head",
  }),
  maxIterations: 30,
  completionSignal: "TASK_COMPLETE",
  idleTimeoutSeconds: 120,
});
```
run() handles the full lifecycle: boot the container, set up the worktree, hand the agent its context, collect commits, and merge back. When it returns, your main branch has the diff without any intermediate state landing in your working tree.
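The `maxIterations` and `completionSignal` options in the call above suggest a bounded loop: keep invoking the agent until its output contains the signal or the iteration cap is reached. The sketch below models that control flow with a stand-in agent. `AgentStep`, `runLoop`, and the stop reasons are hypothetical names, the idle timeout is left out for brevity, and Sandcastle's actual loop may differ.

```typescript
// Hypothetical model of a bounded agent loop; not Sandcastle's implementation.
type AgentStep = (iteration: number) => string; // returns the agent's output for one iteration

interface LoopResult {
  iterations: number;
  reason: "completed" | "max-iterations";
}

function runLoop(step: AgentStep, maxIterations: number, completionSignal: string): LoopResult {
  for (let i = 1; i <= maxIterations; i++) {
    const output = step(i);
    // Stop as soon as the agent's output contains the agreed signal.
    if (output.includes(completionSignal)) {
      return { iterations: i, reason: "completed" };
    }
  }
  return { iterations: maxIterations, reason: "max-iterations" };
}

// Stand-in agent that finishes on its third iteration.
const fakeAgent: AgentStep = (i) => (i === 3 ? "tests added. TASK_COMPLETE" : "still working");

console.log(runLoop(fakeAgent, 30, "TASK_COMPLETE")); // { iterations: 3, reason: 'completed' }
console.log(runLoop(fakeAgent, 2, "TASK_COMPLETE")); // { iterations: 2, reason: 'max-iterations' }
```

Returning the stop reason alongside the count lets a caller tell a finished task apart from one that simply ran out of iterations.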
createSandbox() gives you lower-level lifecycle control — useful for running setup scripts before the agent starts. interactive() drops you into a shell inside the sandbox for debugging.
Branch Strategies
Three strategies control where agent commits land:
| Strategy | Behavior | When to Use |
|---|---|---|
| head | Agent writes directly to the host worktree | Local dev, single agent, manual review |
| merge-to-head | Temp branch, auto-merged to main on completion | CI pipelines, automated review |
| branch | Commits land on a named branch, no auto-merge | Multi-agent parallel runs with explicit merge step |
For parallel agent workflows, branch is the right choice — each agent gets its own named branch, and you run your merge + review step after all agents finish.
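One way to read the three strategies is as a small decision function: does the agent get its own branch, and is that branch merged automatically? The sketch below encodes the table's behavior under that reading. `BranchPlan`, `planBranch`, and the `tmp/` and `agent/` branch prefixes are hypothetical and do not come from the library.

```typescript
// Hypothetical model of the three strategies; names and return shape are assumptions.
type BranchStrategy = "head" | "merge-to-head" | "branch";

interface BranchPlan {
  targetBranch: string | null; // null means: write to the current checkout, no new branch
  autoMerge: boolean;
}

function planBranch(strategy: BranchStrategy, agentId: string): BranchPlan {
  switch (strategy) {
    case "head":
      return { targetBranch: null, autoMerge: false };
    case "merge-to-head":
      return { targetBranch: `tmp/${agentId}`, autoMerge: true };
    case "branch":
      return { targetBranch: `agent/${agentId}`, autoMerge: false };
  }
}

// With "branch", parallel agents end up on distinct branches that a later step merges.
const plans = ["auth", "billing", "docs"].map((id) => planBranch("branch", id));
const branchesToReview = plans.map((p) => p.targetBranch);
console.log(branchesToReview); // [ 'agent/auth', 'agent/billing', 'agent/docs' ]
```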
Lifecycle Hooks
Two hook points let you run setup commands either on the host or inside the container:
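```typescript
import { DockerSandbox } from "@ai-hero/sandcastle/sandboxes/docker";

new DockerSandbox({
  image: "node:22-alpine",
  branchStrategy: "merge-to-head",
  hooks: {
    // Runs on the host once the worktree exists.
    onWorktreeReady: {
      host: "echo 'worktree ready on host'",
      timeout: 5000,
    },
    // Runs inside the container once it has booted.
    onSandboxReady: {
      sandbox: "npm install",
      timeout: 60000,
    },
  },
});
```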
onWorktreeReady fires after the git worktree is created. onSandboxReady fires after the container boots, which makes it the place to install dependencies, seed test fixtures, or copy environment files without those steps reaching your host environment.
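Each hook in the configuration carries a `timeout`. A common way to enforce such a limit in TypeScript is `Promise.race` against a timer, sketched below. `withTimeout`, `HookSpec`, and `runHooksInOrder` are hypothetical helpers. The sketch assumes the timeout values are milliseconds and that hooks run in the order given. Neither detail is confirmed by the inspected metadata.

```typescript
// Hypothetical sketch: enforce a per-hook timeout with Promise.race.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

type HookSpec = { name: string; run: () => Promise<void>; timeout: number };

async function runHooksInOrder(hooks: HookSpec[]): Promise<string[]> {
  const completed: string[] = [];
  for (const hook of hooks) {
    await withTimeout(hook.run(), hook.timeout, hook.name);
    completed.push(hook.name);
  }
  return completed;
}

// Stand-in hooks that resolve immediately.
runHooksInOrder([
  { name: "onWorktreeReady", run: async () => {}, timeout: 5000 },
  { name: "onSandboxReady", run: async () => {}, timeout: 60000 },
]).then((done) => console.log(done)); // [ 'onWorktreeReady', 'onSandboxReady' ]
```

A hook that exceeds its limit rejects, which stops the sequence before any later hook runs.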
Agent Providers
Sandcastle is agent-agnostic. The orchestration layer does not care which agent or model runs inside the container; it only reads commits and output tokens. GitHub Issues confirm Claude, Codex (via OpenCode), and Pi are all targeted as first-class providers.
GitHub Issue #233 describes the OpenCode agent provider PRD with Codex and Pi parity as a stated goal. Issue #583 documents a thinking option being added to the Pi provider — evidence that the project actively maintains multi-model support.
The no-sandbox Provider
For local development where Docker is unavailable or the overhead is not justified, the no-sandbox export runs the agent directly on the host. It accepts the same interface as all other providers, so switching from no-sandbox to DockerSandbox in CI requires one import change.
```typescript
import { run } from "@ai-hero/sandcastle";
import { NoSandbox } from "@ai-hero/sandcastle/sandboxes/no-sandbox";

await run({
  prompt: "...",
  provider: new NoSandbox({ branchStrategy: "branch" }),
});
```
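Since the providers share one interface, the choice between them can be reduced to a single decision made from the environment. The sketch below picks a provider subpath from the exports map listed earlier. `chooseSandbox` and `sandboxImportPath` are hypothetical helpers, and keying the decision on the `CI` variable is an assumption about your pipeline rather than anything Sandcastle prescribes.

```typescript
// Hypothetical helpers; the subpaths mirror the package exports map, the rest is assumed.
type SandboxKind = "docker" | "podman" | "vercel" | "daytona" | "no-sandbox";

function sandboxImportPath(kind: SandboxKind): string {
  return `@ai-hero/sandcastle/sandboxes/${kind}`;
}

function chooseSandbox(env: Record<string, string | undefined>): SandboxKind {
  // Many CI systems, including GitHub Actions, set CI=true.
  return env.CI === "true" ? "docker" : "no-sandbox";
}

console.log(sandboxImportPath(chooseSandbox({ CI: "true" }))); // @ai-hero/sandcastle/sandboxes/docker
console.log(sandboxImportPath(chooseSandbox({}))); // @ai-hero/sandcastle/sandboxes/no-sandbox
```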
CLI: sandcastle init
The CLI scaffolds the .sandcastle/ configuration directory:
sandcastle init
During init, you choose one of five workflow templates:
- blank — empty template, configure everything manually
- simple-loop — single agent, loop until completion signal
- sequential-reviewer — one agent codes, a second agent reviews
- parallel-planner — planner agent fans out to N workers
- parallel-planner-with-review — parallel-planner plus a review pass
After init, sandcastle docker build-image builds the container image from the generated Dockerfile. Re-run it whenever you update the Dockerfile.
Practical Patterns
| Scenario | Provider | Branch Strategy | Template |
|---|---|---|---|
| Exploratory local dev | no-sandbox | head | blank |
| Single automated task | DockerSandbox | merge-to-head | simple-loop |
| Parallel feature agents | DockerSandbox | branch | parallel-planner |
| Agent + code review | DockerSandbox | merge-to-head | parallel-planner-with-review |
| Cloud / serverless | VercelSandbox | branch | blank |
The parallel-planner template is where Sandcastle earns its name. A planner agent reads the task, breaks it into subtasks, and hands each subtask to a worker agent in a separate worktree. Workers run concurrently. The planner collects their branches and merges. Effloow Lab did not run this workflow (no Docker environment), but the exported API surface and template scaffolding confirm it as a designed-in pattern, not a theoretical possibility.
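The fan-out step of that pattern can be modeled in a few lines. The sketch below starts one stand-in worker per subtask, lets them run concurrently, and reports which branches are ready to merge. It does not reproduce the template's code, which was not run. `WorkerFn`, `fanOut`, and the `worker/<n>` branch names are assumptions.

```typescript
// Hypothetical fan-out model; not taken from the parallel-planner template.
type WorkerFn = (subtask: string, branch: string) => Promise<void>;

interface FanOutResult {
  succeeded: string[]; // branches whose worker finished
  failed: string[]; // branches whose worker threw
}

async function fanOut(subtasks: string[], worker: WorkerFn): Promise<FanOutResult> {
  // Every worker is started before any is awaited, so they run concurrently.
  const outcomes = await Promise.all(
    subtasks.map(async (task, i) => {
      const branch = `worker/${i + 1}`;
      try {
        await worker(task, branch);
        return { branch, ok: true };
      } catch {
        // One failing worker must not discard the others' results.
        return { branch, ok: false };
      }
    }),
  );
  return {
    succeeded: outcomes.filter((o) => o.ok).map((o) => o.branch),
    failed: outcomes.filter((o) => !o.ok).map((o) => o.branch),
  };
}

// Stand-in worker that fails on one subtask.
const demoWorker: WorkerFn = async (task) => {
  if (task.includes("flaky")) throw new Error("worker failed");
};

fanOut(["add tests", "flaky migration", "update docs"], demoWorker).then((r) => console.log(r));
// { succeeded: [ 'worker/1', 'worker/3' ], failed: [ 'worker/2' ] }
```

Catching errors per worker keeps a single failure from hiding the branches that did succeed, so the merge step can proceed with partial results.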
What Is Still Missing
A few gaps are worth noting:
- No built-in test runner integration: confirming that an agent's changes pass tests requires a hook or post-merge CI step. Sandcastle does not run tests natively.
- No agent result scoring: there is no built-in pass/fail verdict for agent output. Callers must implement their own review logic.
- Docker required for isolation: the no-sandbox provider offers no isolation. For true security, Docker or Podman must be available in the runtime environment.
- Daytona SDK peer dep: @daytona/sdk ^0.164.0 is a peer dependency, meaning Daytona provider support requires a separate install and a Daytona workspace subscription.
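The first two gaps can be closed on the caller's side with a small gate that turns check results into a verdict. The sketch below is one possible shape for that review logic. `CheckResult`, `Verdict`, and `scoreRun` are hypothetical, and how you collect the exit codes (a hook, a CI job, a script) is left open.

```typescript
// Hypothetical post-run gate; Sandcastle does not ship this, and all names are illustrative.
interface CheckResult {
  name: string; // e.g. "unit tests", "lint"
  exitCode: number; // 0 means the command succeeded
}

type Verdict = { pass: true } | { pass: false; failedChecks: string[] };

function scoreRun(checks: CheckResult[]): Verdict {
  // An empty check list is treated as a failure: no evidence is not a pass.
  if (checks.length === 0) return { pass: false, failedChecks: ["no checks ran"] };
  const failedChecks = checks.filter((c) => c.exitCode !== 0).map((c) => c.name);
  return failedChecks.length === 0 ? { pass: true } : { pass: false, failedChecks };
}

console.log(scoreRun([{ name: "unit tests", exitCode: 0 }, { name: "lint", exitCode: 1 }]));
// { pass: false, failedChecks: [ 'lint' ] }
```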
Verdict
Sandcastle fills a real gap: git worktree isolation for parallel coding agents without cloning the full repo per agent. Judging by the inspected metadata, the API surface is small, the provider abstraction lets you swap sandboxes behind one interface, and the five workflow templates cover single-agent, reviewer, and parallel-planner patterns. The main constraint is access to a sandbox runtime: without Docker, Podman, or one of the cloud providers, you fall back to no-sandbox, which provides no file system isolation.
For teams running GitHub Copilot Workspace, Codex cloud agents, or Claude Code background sessions in CI, Sandcastle gives you the local orchestration equivalent with a single sandcastle.run() call.
FAQ
Does sandcastle work with Claude Code?
There is no direct Claude Code integration in the exported API, but any agent that writes commits is compatible. You pass the agent's executable or subprocess call to the provider config, and Sandcastle collects commits. The no-sandbox provider can wrap Claude Code CLI invocations.
Can I run Sandcastle in GitHub Actions?
Yes. Docker is available in GitHub Actions (ubuntu-latest runners). Use DockerSandbox with branchStrategy: "branch" and push the resulting branches for PR review.
Is this the same as Claude Code's --bg flag?
No. Claude Code's background mode runs Claude sessions on Anthropic's infrastructure. Sandcastle is a self-hosted orchestration layer that runs agents inside your own Docker containers. The two can be used together if your Claude Code session writes commits that Sandcastle then manages.
What models are supported?
The library is model-agnostic — it does not call any LLM API directly. You configure which agent binary runs inside the container. Claude, Codex (via OpenCode), Pi, and any other CLI-driven agent are all compatible. GitHub Issue #233 documents OpenCode parity as an active development goal.
What's the difference between sandcastle and E2B?
E2B is a hosted cloud sandbox-as-a-service with a Python/TypeScript SDK. Sandcastle is a self-hosted TypeScript library that uses Docker or Podman on your own infrastructure. Sandcastle gives you more control over the host environment and costs nothing beyond Docker; E2B adds managed provisioning and removes the Docker dependency.
Effloow Lab inspected @ai-hero/sandcastle@0.7.0 via npm metadata and official documentation on 2026-06-04. No Docker container was run in this environment. Evidence at data/lab-runs/sandcastle-typescript-agent-docker-sandbox-poc-2026.md.