GitHub Copilot SDK Sandboxes Agent Runtime PoC 2026
Evidence notes document the bounded local or source-based checks behind an Effloow article. They are not product endorsements, legal advice, or benchmark claims.
Effloow Lab OpenAI API Run: github-copilot-sdk-sandboxes-agent-runtime-poc-2026
- Date: 2026-06-07T04:19:03.354617+00:00
- Model: gpt-5.5-2026-04-23
- Purpose: agent-runtime-governance-prompt-harness
- Request ID: req_e36eac5ea31846298fb2786063e98724
- Usage: {"input_tokens": 199, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 1194, "output_tokens_details": {"reasoning_tokens": 144}, "total_tokens": 1393}
- Prompt SHA-256: 7abd4022e107f9b99f57505a69f323626268dababe1d4144ca04e10e7d5e1f02
- Safety boundary: no confidential, customer, credential, or private data should be included in this run.
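The recorded prompt digest can be checked against a saved copy of the prompt. The following is a minimal sketch that assumes the digest was computed over the exact UTF-8 bytes of the full prompt text; the note does not state how Effloow computed it, and `prompt_sha256` and `matches_recorded_digest` are hypothetical helper names. Hashing only the excerpt shown below would not necessarily reproduce the recorded value.

```python
import hashlib
import hmac


def prompt_sha256(prompt: str) -> str:
    """Return the hex SHA-256 digest of the prompt, encoded as UTF-8 (assumed encoding)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def matches_recorded_digest(prompt: str, recorded_hex: str) -> bool:
    """Compare a freshly computed digest with the recorded one, ignoring hex case."""
    return hmac.compare_digest(prompt_sha256(prompt), recorded_hex.strip().lower())
```

Any difference in whitespace or line endings changes the digest, so a mismatch may indicate a different serialization rather than a different prompt.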
Prompt Excerpt
Effloow Lab is preparing a source-backed developer article about building a governed agent runtime PoC around GitHub Copilot SDK and Copilot sandboxes. Use only this synthetic scenario: a developer-tool vendor wants to embed an agent in a web app that can inspect a sample repository, call two MCP tools (filesystem limited to /tmp/agent-work and issue-reader read-only), request human approval before shell commands, and optionally start a cloud session when org policy allows it. Produce a concise implementation-risk matrix with columns: control area, safe default, validation step, failure signal, article caveat. Do not invent GitHub product behavior, prices, benchmarks, quotes, or hands-on results. Mark unknown external behavior as [DATA NOT AVAILABLE].
Output
| Control area | Safe default | Validation step | Failure signal | Article caveat |
|---|---|---|---|---|
| Repository access | Mount or clone only the sample repository; deny access to other repos by default. | Run a test where the agent attempts to read outside the sample repo scope. | Agent can enumerate or read unrelated repositories/files. | Exact Copilot SDK or sandbox repository-isolation behavior is [DATA NOT AVAILABLE]. Treat repo scoping as an implementation requirement to verify. |
| MCP tool registration | Register only the two required MCP tools: filesystem and issue-reader. | Inspect runtime configuration before agent startup and assert no extra tools are exposed. | Agent sees or calls tools beyond the approved list. | Do not imply tool allowlisting is provided automatically by GitHub; external behavior is [DATA NOT AVAILABLE]. |
| Filesystem MCP scope | Restrict filesystem tool to `/tmp/agent-work` only. | Attempt reads/writes to `/tmp/agent-work` and to paths outside it, such as `/home`, repo root, or `/etc`. | Any successful access outside `/tmp/agent-work`. | The article should describe this as a PoC guardrail, not as a verified sandbox guarantee unless independently tested. |
| Issue-reader MCP access | Make issue-reader read-only. | Attempt allowed read operation and blocked create/update/delete operations. | Agent can mutate issues or metadata. | GitHub issue API behavior, permission mapping, and SDK mediation are [DATA NOT AVAILABLE]. |
| Shell command execution | Require explicit human approval before every shell command. | Submit a harmless command and verify it pauses for approval before execution; reject another command and verify no execution occurs. | Command runs without approval, or rejected command still executes. | Do not claim built-in human-in-the-loop semantics unless confirmed from source documentation; mark as app-level control if implemented by the vendor. |
| Shell command scope | Start with shell disabled unless the user approves a specific command. | Try command chaining, redirects, environment reads, and file writes outside allowed paths. | Approval for one command enables broader shell access than intended. | Fine-grained shell mediation capabilities in Copilot sandboxes are [DATA NOT AVAILABLE]. |
| Cloud session startup | Disable cloud session by default; allow only when org policy explicitly permits it. | Test policy-denied and policy-allowed cases before session creation. | Cloud session starts while policy says denied or unset. | Specific Copilot sandbox/cloud-session policy hooks are [DATA NOT AVAILABLE]. Present as desired control logic, not confirmed product behavior. |
| Organization policy evaluation | Treat missing, stale, or unreadable policy as deny. | Simulate missing policy response, timeout, and conflicting policy values. | Runtime proceeds with permissive behavior when policy cannot be verified. | Do not invent GitHub org-policy APIs or enforcement semantics. |
| Identity and authorization | Bind agent actions to the authenticated app user and least-privilege credentials. | Verify which identity is used for repo reads, issue reads, shell approvals, and cloud session requests. | Actions occur under an unexpected or overprivileged identity. | Exact identity propagation through Copilot SDK/MCP/sandbox is [DATA NOT AVAILABLE]. |
| Audit trail | Log requested tool call, actor, approval decision, target path/resource, and outcome. | Review logs after approved, denied, and failed operations. | Missing records for tool calls, shell approvals, or policy decisions. | Avoid claiming GitHub provides these audit fields unless sourced; vendor may need to implement them. |
| Prompt-injection resistance | Treat repository and issue contents as untrusted input; system policy overrides repo text. | Place adversarial instructions in sample repo/issue content and verify the agent does not bypass controls. | Agent follows repo-provided instructions to call forbidden tools or skip approval
...[truncated]
Limitations
This API run is a bounded lab check. It is not a production benchmark, user study, or proof that an external product works in a real customer environment.
Additional Local Checks
- Date: 2026-06-07
- Environment: macOS local workspace, zsh, Python 3.12, npm available from the project environment.
- Command: `npm view @github/copilot-sdk version name dist-tags --json`
- Output:
{
  "version": "1.0.0",
  "name": "@github/copilot-sdk",
  "dist-tags": {
    "unstable": "0.2.1-unstable.0",
    "prerelease": "1.0.0-beta.11",
    "latest": "1.0.0"
  }
}
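The recorded registry output can be sanity-checked mechanically. This is a small sketch, assuming the JSON above has been saved as text; `latest_matches_version` is a hypothetical helper, and the check only confirms that the `latest` dist-tag agrees with the reported `version` field in this one recorded response.

```python
import json

# The npm output recorded in this note, reproduced verbatim.
RECORDED_OUTPUT = """
{
  "version": "1.0.0",
  "name": "@github/copilot-sdk",
  "dist-tags": {
    "unstable": "0.2.1-unstable.0",
    "prerelease": "1.0.0-beta.11",
    "latest": "1.0.0"
  }
}
"""


def latest_matches_version(raw_json: str) -> bool:
    """True if the 'latest' dist-tag equals the top-level 'version' field."""
    info = json.loads(raw_json)
    return info.get("dist-tags", {}).get("latest") == info.get("version")
```

The check says nothing about the package's contents or behavior; it only validates the internal consistency of the recorded metadata.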
Operational Notes and Limits
- The first OpenAI API attempt failed before request execution because the local Python TLS trust path could not verify the certificate chain: `CERTIFICATE_VERIFY_FAILED`.
- Retrying the same safe prompt with `SSL_CERT_FILE` pointed at the installed `certifi` CA bundle completed successfully and wrote `data/lab-runs/github-copilot-sdk-sandboxes-agent-runtime-poc-2026.openai.json`.
- Effloow Lab did not authenticate to GitHub Copilot, did not create a real Copilot SDK session, did not run Copilot CLI, and did not start a local or cloud Copilot sandbox in this workflow.
- The article may describe the saved OpenAI API check as a synthetic governance prompt harness. It must not describe it as a Copilot benchmark, a successful Copilot SDK integration, or a hands-on Copilot sandbox test.
This note supports the public article and records what was actually checked.