ARTICLES · 2026-06-04 · BY EFFLOOW CONTENT FACTORY

Sandcastle: Run Parallel AI Coding Agents in Docker Worktrees

Sandcastle gives each AI coding agent an isolated Docker worktree with a single sandcastle.run() call — no file sync, no contamination.

ai-agents docker typescript coding-agents sandboxing


Running an AI coding agent unattended means trusting it not to corrupt your working directory. Most developer setups skip that part — the agent writes wherever the model tells it to. Sandcastle solves the problem at the file system level: every agent gets its own git worktree mounted inside a Docker container, and the only thing that touches your host is a merged commit when the agent finishes.

The library is built by Matt Pocock (creator of ts-error-translator and total-typescript), released under MIT, and published on npm as @ai-hero/sandcastle. This article covers the inspected API surface and what it means for teams running parallel AI coding workflows.

Effloow Lab note: @ai-hero/sandcastle@0.7.0 was inspected via npm metadata and official documentation on 2026-06-04. No Docker container was started and no agent was invoked in this lab run. Evidence is at data/lab-runs/sandcastle-typescript-agent-docker-sandbox-poc-2026.md.

The Problem It Solves

When you run a coding agent in your working tree, it competes with your in-progress edits. Two agents on the same repo at the same time produce merge conflicts, overwritten files, and failed tests that are hard to attribute. The standard workaround — clone a fresh copy per agent — requires disk space, network, and a new npm/pip install for every run.

Sandcastle's approach: use git worktrees as the isolation primitive. A worktree is a linked checkout that shares the object store with your main repo. Each agent gets a worktree on its own branch, inside a Docker bind-mount. The container sees the files; the host sees only the final committed diff.
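A rough sketch of what that isolation amounts to on the command line, assuming one branch and one worktree directory per agent. The helper name planWorkspace, the agent/<id> branch naming, the /tmp/worktrees base directory, and the /workspace mount point are illustrative assumptions, not Sandcastle internals. Only the git worktree add and docker run flags are standard.

```typescript
// Hypothetical helper: none of these names come from Sandcastle itself.
interface AgentWorkspace {
  branch: string;
  worktreePath: string;
  gitArgs: string[];    // argv for `git`, run on the host
  dockerArgs: string[]; // argv for `docker`, run on the host
}

function planWorkspace(
  agentId: string,
  image: string,
  baseDir: string = "/tmp/worktrees", // assumed location for linked checkouts
): AgentWorkspace {
  const branch = `agent/${agentId}`;
  const worktreePath = `${baseDir}/${agentId}`;
  return {
    branch,
    worktreePath,
    // `git worktree add -b <branch> <path>` creates a linked checkout on a new branch.
    gitArgs: ["worktree", "add", "-b", branch, worktreePath],
    // `-v host:container` bind-mounts the worktree; `-w` sets the working directory.
    dockerArgs: ["run", "--rm", "-v", `${worktreePath}:/workspace`, "-w", "/workspace", image],
  };
}

// Two agents get two branches and two directories, so their edits cannot collide.
const workspacePlans = ["a1", "a2"].map((id) => planWorkspace(id, "node:22-alpine"));
for (const plan of workspacePlans) {
  console.log(`git ${plan.gitArgs.join(" ")}`);
  console.log(`docker ${plan.dockerArgs.join(" ")}`);
}
```

One detail the sketch leaves out: a linked worktree's .git file points back at the main repository's .git directory, so git commands run inside the container would also need that directory mounted.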

What Was Verified

Effloow Lab inspected @ai-hero/sandcastle via npm info --json on 2026-06-04:

Version: 0.7.0
License: MIT
Published: May 30, 2026
Unpacked size: 14.6 MB
Direct dependency: @clack/prompts ^1.1.0 (interactive CLI prompts — 1 dep total)
Peer dependencies: @daytona/sdk ^0.164.0, @vercel/sandbox >=1.0.0
Maintainer: mpocock (published via GitHub Actions OIDC)

The package exports map lists the main entry point plus five sandbox providers:

.                      → sandcastle.run() main API
./sandboxes/docker     → Docker provider
./sandboxes/vercel     → Vercel sandbox provider  
./sandboxes/podman     → Podman provider
./sandboxes/daytona    → Daytona cloud provider
./sandboxes/no-sandbox → run agent directly on host (no isolation)

The Core API: sandcastle.run()

The library exports three primary functions:

import { run, createSandbox, interactive } from "@ai-hero/sandcastle";
import { DockerSandbox } from "@ai-hero/sandcastle/sandboxes/docker";

await run({
  prompt: "Add a Jest test for the parseDate function in src/utils.ts",
  provider: new DockerSandbox({
    image: "node:22-alpine",
    branchStrategy: "merge-to-head",
  }),
  maxIterations: 30,
  completionSignal: "TASK_COMPLETE",
  idleTimeoutSeconds: 120,
});

run() handles the full lifecycle: boot the container, set up the worktree, hand the agent its context, collect commits, and merge back. When it returns, your main branch has the diff without any intermediate state landing in your working tree.
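The maxIterations and completionSignal options suggest a loop that stops on whichever condition comes first. Below is a minimal model of that control flow, assuming the signal is detected by a substring match on the agent's output (the inspected material does not confirm how Sandcastle detects it). runLoop and its step callback are hypothetical names, and idleTimeoutSeconds is not modelled.

```typescript
// Hypothetical model of an iteration loop; not Sandcastle's implementation.
type StopReason = "completion-signal" | "max-iterations";

interface LoopResult {
  iterations: number;
  reason: StopReason;
}

function runLoop(
  step: (iteration: number) => string, // returns the agent's output for one iteration
  maxIterations: number,
  completionSignal: string,
): LoopResult {
  for (let i = 1; i <= maxIterations; i++) {
    const output = step(i);
    // Assumption: the signal counts as seen when it appears anywhere in the output.
    if (output.indexOf(completionSignal) !== -1) {
      return { iterations: i, reason: "completion-signal" };
    }
  }
  // The iteration cap is a safety net against an agent that never finishes.
  return { iterations: maxIterations, reason: "max-iterations" };
}

// A fake agent that reports completion on its third iteration.
const fakeAgent = (i: number): string => (i < 3 ? "still working" : "done. TASK_COMPLETE");
const loopResult = runLoop(fakeAgent, 30, "TASK_COMPLETE");
console.log(loopResult.iterations, loopResult.reason); // 3 completion-signal
```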

createSandbox() gives you lower-level lifecycle control — useful for running setup scripts before the agent starts. interactive() drops you into a shell inside the sandbox for debugging.

Branch Strategies

Three strategies control where agent commits land:

Strategy	Behavior	When to Use
`head`	Agent writes directly to the host worktree	Local dev, single agent, manual review
`merge-to-head`	Temp branch, auto-merged to main on completion	CI pipelines, automated review
`branch`	Commits land on a named branch, no auto-merge	Multi-agent parallel runs with explicit merge step

For parallel agent workflows, branch is the right choice — each agent gets its own named branch, and you run your merge + review step after all agents finish.
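The explicit merge step for the branch strategy could look roughly like this. It is a sketch under assumptions: planMergeStep is a hypothetical helper, the branch names are made up, and the choice of --no-ff merges is one reasonable option rather than anything Sandcastle prescribes.

```typescript
// Hypothetical helper; the function name and branch names are illustrative only.
function planMergeStep(target: string, agentBranches: string[]): string[][] {
  // Each inner array is an argv for `git`, run on the host after all agents finish.
  const commands: string[][] = [["checkout", target]];
  for (const branch of agentBranches) {
    // --no-ff keeps one merge commit per agent, so each contribution stays attributable.
    commands.push(["merge", "--no-ff", branch]);
  }
  return commands;
}

const mergeSteps = planMergeStep("main", ["agent/auth", "agent/billing"]);
for (const argv of mergeSteps) {
  console.log(`git ${argv.join(" ")}`);
}
```

Merging one branch at a time means a conflict stops the sequence at the agent that caused it, which is where a review step would naturally sit.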

Lifecycle Hooks

Two hook points let you run setup commands, either on the host or inside the container:

new DockerSandbox({
  image: "node:22-alpine",
  branchStrategy: "merge-to-head",
  hooks: {
    onWorktreeReady: {
      host: "echo 'worktree ready on host'",
      timeout: 5000,
    },
    onSandboxReady: {
      sandbox: "npm install",
      timeout: 60000,
    },
  },
});

onWorktreeReady fires after the git worktree is created. onSandboxReady fires after the container boots, which makes it the place to install dependencies, seed test fixtures, or copy environment files without those steps reaching your host environment.
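Each hook above carries a timeout. A generic way to enforce one around asynchronous work is to race it against a timer, sketched below. This is not Sandcastle's code: withTimeout is a hypothetical helper, and reading the timeout values as milliseconds is an assumption based on the magnitudes shown (5000, 60000).

```typescript
// Hypothetical helper: rejects if `work` does not settle within `timeoutMs`.
// Note that losing the race does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
  });
  return Promise.race([work, timeout]).then(
    (value) => {
      clearTimeout(timer); // stop the timer so it does not keep the process alive
      return value;
    },
    (err) => {
      clearTimeout(timer);
      throw err;
    },
  );
}

// Stand-in for a hook command such as `npm install` that finishes quickly.
const fakeInstall = new Promise<string>((resolve) => setTimeout(() => resolve("installed"), 10));
withTimeout(fakeInstall, 60000, "onSandboxReady").then((msg) => console.log(msg));
```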

Agent Providers

Sandcastle is provider-agnostic. The orchestration layer does not care which agent model runs inside the container; it only reads commits and output tokens. GitHub issues indicate that Claude, Codex (via OpenCode), and Pi are all targeted as first-class providers.

GitHub Issue #233 describes the OpenCode agent provider PRD with Codex and Pi parity as a stated goal. Issue #583 documents a thinking option being added to the Pi provider — evidence that the project actively maintains multi-model support.

The no-sandbox Provider

For local development where Docker is unavailable or the overhead is not justified, the no-sandbox export runs the agent directly on the host. It accepts the same interface as all other providers, so switching from no-sandbox to DockerSandbox in CI requires one import change.

import { NoSandbox } from "@ai-hero/sandcastle/sandboxes/no-sandbox";

await run({
  prompt: "...",
  provider: new NoSandbox({ branchStrategy: "branch" }),
});

CLI: sandcastle init

The CLI scaffolds the .sandcastle/ configuration directory:

sandcastle init

During init, you choose one of five workflow templates:

blank — empty template, configure everything manually
simple-loop — single agent, loop until completion signal
sequential-reviewer — one agent codes, a second agent reviews
parallel-planner — planner agent fans out to N workers
parallel-planner-with-review — parallel-planner plus a review pass

After init, sandcastle docker build-image builds the container image from the generated Dockerfile. Re-run it whenever you update the Dockerfile.

Practical Patterns

Scenario	Provider	Branch Strategy	Template
Exploratory local dev	`no-sandbox`	`head`	blank
Single automated task	`DockerSandbox`	`merge-to-head`	simple-loop
Parallel feature agents	`DockerSandbox`	`branch`	parallel-planner
Agent + code review	`DockerSandbox`	`merge-to-head`	sequential-reviewer
Cloud / serverless	`VercelSandbox`	`branch`	blank

The parallel-planner template is where Sandcastle earns its name. A planner agent reads the task, breaks it into subtasks, and hands each subtask to a worker agent in a separate worktree. Workers run concurrently. The planner collects their branches and merges. Effloow Lab did not run this workflow (no Docker environment), but the exported API surface and template scaffolding confirm it as a designed-in pattern, not a theoretical possibility.
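The fan-out shape described above can be sketched with Promise.all. This illustrates the pattern only, not the template's generated code: fanOut, the worker callback, and the agent/worker-N branch naming are hypothetical, and the worker here is a stub rather than a real agent run.

```typescript
interface WorkerResult {
  subtask: string;
  branch: string;
}

// Hypothetical fan-out: one worker per subtask, each on its own branch.
async function fanOut(
  subtasks: string[],
  runWorker: (subtask: string, branch: string) => Promise<void>,
): Promise<WorkerResult[]> {
  const jobs = subtasks.map(async (subtask, index) => {
    const branch = `agent/worker-${index + 1}`;
    await runWorker(subtask, branch); // in a real setup this would run an agent in its sandbox
    return { subtask, branch };
  });
  // All workers start before any is awaited, so they run concurrently.
  // Promise.all rejects as soon as one worker fails.
  return Promise.all(jobs);
}

fanOut(["add tests", "update docs"], async (subtask, branch) => {
  console.log(`worker on ${branch}: ${subtask}`);
}).then((results) => {
  // The planner would now merge or review these branches.
  console.log(results.map((r) => r.branch).join(", "));
});
```

If one failing worker should not discard the others' results, Promise.allSettled is the usual alternative.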

What Is Still Missing

A few gaps are worth noting:

No built-in test runner integration — confirming that an agent's changes pass tests requires a hook or post-merge CI step. Sandcastle does not run tests natively.
No agent result scoring — there is no built-in pass/fail verdict for agent output. Callers must implement their own review logic.
Docker required for isolation — the no-sandbox provider offers no isolation. For true security, Docker or Podman must be available in the runtime environment.
Daytona SDK peer dep — @daytona/sdk ^0.164.0 is a peer dependency, meaning Daytona provider support requires a separate install and a Daytona workspace subscription.

Verdict

Verdict: Strong tool for teams parallelizing AFK coding agents

Sandcastle fills a real gap: git worktree isolation for parallel coding agents without cloning the full repo per agent. The API surface is clean, the provider abstraction is solid, and the five workflow templates cover most production patterns. The main constraint is sandbox availability: without Docker, Podman, or one of the cloud providers, you fall back to no-sandbox, which provides no file system isolation.

For teams running GitHub Copilot Workspace, Codex cloud agents, or Claude Code background sessions in CI, Sandcastle gives you the local orchestration equivalent with a single sandcastle.run() call.

FAQ

Does Sandcastle work with Claude Code?

There is no direct Claude Code integration in the exported API, but any agent that writes commits is compatible. You pass the agent's executable or subprocess call to the provider config, and Sandcastle collects commits. The no-sandbox provider can wrap Claude Code CLI invocations.

Can I run Sandcastle in GitHub Actions?

Yes. Docker is available in GitHub Actions (ubuntu-latest runners). Use DockerSandbox with branchStrategy: "branch" and push the resulting branches for PR review.

Is this the same as Claude Code's `--bg` flag?

No. Claude Code's background mode runs Claude sessions on Anthropic's infrastructure. Sandcastle is an on-premise orchestration layer that runs agents inside your own Docker containers. The two can be used together if your Claude Code session writes commits that Sandcastle then manages.

What models are supported?

The library is model-agnostic — it does not call any LLM API directly. You configure which agent binary runs inside the container. Claude, Codex (via OpenCode), Pi, and any other CLI-driven agent are all compatible. GitHub Issue #233 documents OpenCode parity as an active development goal.

What's the difference between Sandcastle and E2B?

E2B is a hosted cloud sandbox-as-a-service with a Python/TypeScript SDK. Sandcastle is a self-hosted TypeScript library that uses Docker or Podman on your own infrastructure. Sandcastle gives you more control over the host environment and costs nothing beyond Docker; E2B adds managed provisioning and removes the Docker dependency.

