Effloow Lab OpenAI API Run: github-copilot-sdk-sandboxes-agent-runtime-poc-2026

Date: 2026-06-07T04:19:03.354617+00:00
Model: gpt-5.5-2026-04-23
Purpose: agent-runtime-governance-prompt-harness
Request ID: req_e36eac5ea31846298fb2786063e98724
Usage: {"input_tokens": 199, "input_tokens_details": {"cached_tokens": 0}, "output_tokens": 1194, "output_tokens_details": {"reasoning_tokens": 144}, "total_tokens": 1393}
Prompt SHA-256: 7abd4022e107f9b99f57505a69f323626268dababe1d4144ca04e10e7d5e1f02
Safety boundary: no confidential, customer, credential, or private data should be included in this run.
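
The Usage and Prompt SHA-256 fields above can be checked mechanically. The following is a minimal sketch, not the harness code: the function names are hypothetical, and it assumes the hash was taken over the UTF-8 bytes of the full prompt with no normalization, which this run record does not confirm.

```python
import hashlib
import json

# Usage line copied from this run record.
USAGE_JSON = (
    '{"input_tokens": 199, "input_tokens_details": {"cached_tokens": 0}, '
    '"output_tokens": 1194, "output_tokens_details": {"reasoning_tokens": 144}, '
    '"total_tokens": 1393}'
)


def prompt_sha256(prompt_text: str) -> str:
    """Hex SHA-256 of the prompt. Assumption: UTF-8 bytes, no normalization."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()


def usage_totals_match(usage: dict) -> bool:
    """True when input and output tokens add up to the reported total."""
    return usage["input_tokens"] + usage["output_tokens"] == usage["total_tokens"]


if __name__ == "__main__":
    print(usage_totals_match(json.loads(USAGE_JSON)))
```

Only the full prompt text, not the excerpt below, could be expected to reproduce the recorded hash.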

Prompt Excerpt

Effloow Lab is preparing a source-backed developer article about building a governed agent runtime PoC around GitHub Copilot SDK and Copilot sandboxes. Use only this synthetic scenario: a developer-tool vendor wants to embed an agent in a web app that can inspect a sample repository, call two MCP tools (filesystem limited to /tmp/agent-work and issue-reader read-only), request human approval before shell commands, and optionally start a cloud session when org policy allows it. Produce a concise implementation-risk matrix with columns: control area, safe default, validation step, failure signal, article caveat. Do not invent GitHub product behavior, prices, benchmarks, quotes, or hands-on results. Mark unknown external behavior as [DATA NOT AVAILABLE].

Output

| Control area | Safe default | Validation step | Failure signal | Article caveat |
|---|---|---|---|---|
| Repository access | Mount or clone only the sample repository; deny access to other repos by default. | Run a test where the agent attempts to read outside the sample repo scope. | Agent can enumerate or read unrelated repositories/files. | Exact Copilot SDK or sandbox repository-isolation behavior is [DATA NOT AVAILABLE]. Treat repo scoping as an implementation requirement to verify. |
| MCP tool registration | Register only the two required MCP tools: filesystem and issue-reader. | Inspect runtime configuration before agent startup and assert no extra tools are exposed. | Agent sees or calls tools beyond the approved list. | Do not imply tool allowlisting is provided automatically by GitHub; external behavior is [DATA NOT AVAILABLE]. |
| Filesystem MCP scope | Restrict filesystem tool to `/tmp/agent-work` only. | Attempt reads/writes to `/tmp/agent-work` and to paths outside it, such as `/home`, repo root, or `/etc`. | Any successful access outside `/tmp/agent-work`. | The article should describe this as a PoC guardrail, not as a verified sandbox guarantee unless independently tested. |
| Issue-reader MCP access | Make issue-reader read-only. | Attempt allowed read operation and blocked create/update/delete operations. | Agent can mutate issues or metadata. | GitHub issue API behavior, permission mapping, and SDK mediation are [DATA NOT AVAILABLE]. |
| Shell command execution | Require explicit human approval before every shell command. | Submit a harmless command and verify it pauses for approval before execution; reject another command and verify no execution occurs. | Command runs without approval, or rejected command still executes. | Do not claim built-in human-in-the-loop semantics unless confirmed from source documentation; mark as app-level control if implemented by the vendor. |
| Shell command scope | Start with shell disabled unless the user approves a specific command. | Try command chaining, redirects, environment reads, and file writes outside allowed paths. | Approval for one command enables broader shell access than intended. | Fine-grained shell mediation capabilities in Copilot sandboxes are [DATA NOT AVAILABLE]. |
| Cloud session startup | Disable cloud session by default; allow only when org policy explicitly permits it. | Test policy-denied and policy-allowed cases before session creation. | Cloud session starts while policy says denied or unset. | Specific Copilot sandbox/cloud-session policy hooks are [DATA NOT AVAILABLE]. Present as desired control logic, not confirmed product behavior. |
| Organization policy evaluation | Treat missing, stale, or unreadable policy as deny. | Simulate missing policy response, timeout, and conflicting policy values. | Runtime proceeds with permissive behavior when policy cannot be verified. | Do not invent GitHub org-policy APIs or enforcement semantics. |
| Identity and authorization | Bind agent actions to the authenticated app user and least-privilege credentials. | Verify which identity is used for repo reads, issue reads, shell approvals, and cloud session requests. | Actions occur under an unexpected or overprivileged identity. | Exact identity propagation through Copilot SDK/MCP/sandbox is [DATA NOT AVAILABLE]. |
| Audit trail | Log requested tool call, actor, approval decision, target path/resource, and outcome. | Review logs after approved, denied, and failed operations. | Missing records for tool calls, shell approvals, or policy decisions. | Avoid claiming GitHub provides these audit fields unless sourced; vendor may need to implement them. |
| Prompt-injection resistance | Treat repository and issue contents as untrusted input; system policy overrides repo text. | Place adversarial instructions in sample repo/issue content and verify the agent does not bypass controls. | Agent follows repo-provided instructions to call forbidden tools or skip approval
...[truncated]
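
Three of the safe defaults in the matrix (filesystem scope, deny on missing policy, approval before shell commands) can be expressed as app-level checks. The sketch below is illustrative only. It does not use or describe any Copilot SDK, MCP, or sandbox API; every function name and the `cloud_sessions` policy key are hypothetical.

```python
from pathlib import Path
from typing import Callable, Optional

ALLOWED_ROOT = Path("/tmp/agent-work")  # scope taken from the synthetic scenario


def is_within_allowed_root(candidate: str, root: Path = ALLOWED_ROOT) -> bool:
    """Resolve symlinks and '..' segments, then require containment in root."""
    resolved_root = root.resolve()
    # An absolute candidate replaces the root when joined, so it is checked as-is.
    resolved = (resolved_root / candidate).resolve()
    return resolved.is_relative_to(resolved_root)


def cloud_session_allowed(policy: Optional[dict]) -> bool:
    """Fail closed: a missing, unreadable, or ambiguous policy means deny.
    The 'cloud_sessions' key is a hypothetical stand-in for an org policy field."""
    if not isinstance(policy, dict):
        return False
    return policy.get("cloud_sessions") == "allowed"


def run_if_approved(
    command: str,
    approve: Callable[[str], bool],
    execute: Callable[[str], object],
) -> bool:
    """Call execute only after approve returns exactly True for this command."""
    if approve(command) is not True:
        return False
    execute(command)
    return True
```

Checks like these cover the validation steps for the vendor's own code paths; they say nothing about what an external sandbox enforces.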

Limitations

This API run is a bounded lab check. It is not a production benchmark, user study, or proof that an external product works in a real customer environment.

Additional Local Checks

Date: 2026-06-07
Environment: macOS local workspace, zsh, Python 3.12, npm available from the project environment.
Command: npm view @github/copilot-sdk version name dist-tags --json
Output:

{
  "version": "1.0.0",
  "name": "@github/copilot-sdk",
  "dist-tags": {
    "unstable": "0.2.1-unstable.0",
    "prerelease": "1.0.0-beta.11",
    "latest": "1.0.0"
  }
}
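
The recorded registry output can be checked for internal consistency without calling npm again. This is a minimal sketch with a hypothetical helper name; it parses the JSON shown above and confirms that the `latest` dist-tag points at the reported version. It makes no claim about the package's contents or behavior.

```python
import json

# Output copied from the npm view command recorded above.
NPM_VIEW_OUTPUT = """
{
  "version": "1.0.0",
  "name": "@github/copilot-sdk",
  "dist-tags": {
    "unstable": "0.2.1-unstable.0",
    "prerelease": "1.0.0-beta.11",
    "latest": "1.0.0"
  }
}
"""


def latest_matches_version(raw_json: str) -> bool:
    """True when the 'latest' dist-tag equals the top-level version field."""
    info = json.loads(raw_json)
    return info["dist-tags"].get("latest") == info["version"]


if __name__ == "__main__":
    print(latest_matches_version(NPM_VIEW_OUTPUT))
```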

Operational Notes and Limits

The first OpenAI API attempt failed before the request was sent: local Python could not verify the server's certificate chain during the TLS handshake (CERTIFICATE_VERIFY_FAILED).
Retrying the same safe prompt with SSL_CERT_FILE pointed at the installed certifi CA bundle completed successfully and wrote data/lab-runs/github-copilot-sdk-sandboxes-agent-runtime-poc-2026.openai.json.
Effloow Lab did not authenticate to GitHub Copilot, did not create a real Copilot SDK session, did not run Copilot CLI, and did not start a local or cloud Copilot sandbox in this workflow.
The article may describe the saved OpenAI API check as a synthetic governance prompt harness. It must not describe it as a Copilot benchmark, a successful Copilot SDK integration, or a hands-on Copilot sandbox test.
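
The TLS workaround noted above (pointing SSL_CERT_FILE at the certifi CA bundle) can be approximated in Python as follows. This is a sketch under assumptions: the run record does not say how the lab's client consumed the variable, the helper names are hypothetical, and `certifi` is treated as optional.

```python
import os
import ssl
from typing import Optional


def certifi_bundle_path() -> Optional[str]:
    """Return the certifi CA bundle path, or None if certifi is not installed."""
    try:
        import certifi  # third-party package; optional in this sketch
    except ImportError:
        return None
    return certifi.where()


def build_tls_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Build a verifying TLS context.

    Preference order: explicit cafile, then SSL_CERT_FILE, then certifi,
    then the platform defaults. Verification is never disabled.
    """
    cafile = cafile or os.environ.get("SSL_CERT_FILE") or certifi_bundle_path()
    if cafile:
        return ssl.create_default_context(cafile=cafile)
    return ssl.create_default_context()
```

The context keeps certificate and hostname verification on, so the workaround changes only which CA bundle is trusted, not whether the chain is checked.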
