OpenAI Agents SDK Guardrails Local Sandbox PoC 2026
Date: 2026-05-30
Track: sandbox-poc
Slug: openai-agents-sdk-guardrails-local-sandbox-poc-2026
Goal
Verify whether OpenAI Agents SDK guardrail logic can be exercised locally without API credits, then document the boundary between:
- Guardrail function behavior that can be tested without model calls.
- Runner-level input tripwire behavior that can block before a model call.
- Full output-guardrail-on-model-response behavior, which still requires a model response and was not tested here.
No OpenAI API key, production credential, hosted sandbox, or paid model call was used.
Environment
Host: macOS Darwin 24.6.0 arm64
Working directory: /tmp/effloow-openai-agents-guardrails-poc
Python: Python 3.12.8
openai-agents: 0.17.4
openai: 2.38.0
Date: 2026-05-30
Commands and Outputs
1. Prepare isolated sandbox
rm -rf /tmp/effloow-openai-agents-guardrails-poc
mkdir -p /tmp/effloow-openai-agents-guardrails-poc
python3 -m venv /tmp/effloow-openai-agents-guardrails-poc/.venv
/tmp/effloow-openai-agents-guardrails-poc/.venv/bin/python -V
Output:
Python 3.12.8
2. Install current SDK package
/tmp/effloow-openai-agents-guardrails-poc/.venv/bin/python -m pip install --upgrade pip
/tmp/effloow-openai-agents-guardrails-poc/.venv/bin/python -m pip install openai-agents
Relevant output:
Successfully installed ... openai-2.38.0 openai-agents-0.17.4 ...
Version check:
/tmp/effloow-openai-agents-guardrails-poc/.venv/bin/python -m pip show openai-agents openai | sed -n '1,120p'
Relevant output:
Name: openai-agents
Version: 0.17.4
Summary: OpenAI Agents SDK
Home-page: https://openai.github.io/openai-agents-python/
Requires: griffelib, mcp, openai, pydantic, requests, types-requests, typing-extensions, websockets
---
Name: openai
Version: 2.38.0
Summary: The official Python library for the openai API
Home-page: https://github.com/openai/openai-python
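The same version check can be done from Python, which is handy inside a test setup step. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of either package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the pip output above.
    for name in ("openai-agents", "openai"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run with the virtualenv interpreter, this should report the same versions that pip show printed; run with any other interpreter, it reports whatever that environment has installed.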
3. Build local guardrail PoC
File: /tmp/effloow-openai-agents-guardrails-poc/guardrail_poc.py
import asyncio
import json
import os
import re

import agents
from agents import (
    Agent,
    GuardrailFunctionOutput,
    InputGuardrailTripwireTriggered,
    Runner,
    RunContextWrapper,
    input_guardrail,
    output_guardrail,
    set_tracing_disabled,
)

PII_PATTERN = re.compile(r"[\w.+-]+@[\w.-]+\.\w+|sk-[A-Za-z0-9_-]{8,}")


@input_guardrail(run_in_parallel=False)
async def reject_pii_input(ctx, agent, user_input):
    text = user_input if isinstance(user_input, str) else json.dumps(user_input)
    found = bool(PII_PATTERN.search(text))
    return GuardrailFunctionOutput(
        output_info={"contains_pii": found, "agent": agent.name},
        tripwire_triggered=found,
    )


@output_guardrail
async def reject_unredacted_output(ctx, agent, output):
    text = str(output)
    found = bool(PII_PATTERN.search(text))
    return GuardrailFunctionOutput(
        output_info={"contains_unredacted_pii": found, "agent": agent.name},
        tripwire_triggered=found,
    )


async def manual_guardrail_checks(agent):
    ctx = RunContextWrapper(context=None)
    safe_input = await reject_pii_input.run(agent, "Summarize the public changelog.", ctx)
    unsafe_input = await reject_pii_input.run(agent, "Email jane@example.com the report.", ctx)
    safe_output = await reject_unredacted_output.run(ctx, agent, "The report is ready.")
    unsafe_output = await reject_unredacted_output.run(ctx, agent, "Send it to jane@example.com.")
    return {
        "safe_input_tripwire": safe_input.output.tripwire_triggered,
        "unsafe_input_tripwire": unsafe_input.output.tripwire_triggered,
        "safe_output_tripwire": safe_output.output.tripwire_triggered,
        "unsafe_output_tripwire": unsafe_output.output.tripwire_triggered,
    }


def runner_blocking_check(agent):
    try:
        Runner.run_sync(agent, "Send jane@example.com a copy of the report.")
    except InputGuardrailTripwireTriggered as exc:
        return {
            "runner_blocked_before_model": True,
            "exception": type(exc).__name__,
            "output_info": exc.guardrail_result.output.output_info,
        }
    return {"runner_blocked_before_model": False}


def main():
    os.environ.pop("OPENAI_API_KEY", None)
    set_tracing_disabled(True)
    agent = Agent(
        name="policy-demo",
        instructions="Return concise compliance summaries.",
        input_guardrails=[reject_pii_input],
        output_guardrails=[reject_unredacted_output],
    )
    payload = {
        "agents_version": agents.__version__,
        "openai_api_key_present": "OPENAI_API_KEY" in os.environ,
        "manual_checks": asyncio.run(manual_guardrail_checks(agent)),
        "runner_check": runner_blocking_check(agent),
    }
    print(json.dumps(payload, indent=2, sort_keys=True))


if __name__ == "__main__":
    main()
4. Run local PoC without API key
/tmp/effloow-openai-agents-guardrails-poc/.venv/bin/python /tmp/effloow-openai-agents-guardrails-poc/guardrail_poc.py
printf 'exit_code=%s\n' $?
Output:
{
  "agents_version": "0.17.4",
  "manual_checks": {
    "safe_input_tripwire": false,
    "safe_output_tripwire": false,
    "unsafe_input_tripwire": true,
    "unsafe_output_tripwire": true
  },
  "openai_api_key_present": false,
  "runner_check": {
    "exception": "InputGuardrailTripwireTriggered",
    "output_info": {
      "agent": "policy-demo",
      "contains_pii": true
    },
    "runner_blocked_before_model": true
  }
}
exit_code=0
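Because the PoC prints a sorted JSON payload, the run can be turned into a pass/fail check instead of being read by eye. The sketch below embeds the recorded output and validates the fields that matter; the function name check_payload and its messages are hypothetical and not part of the PoC script.

```python
import json

# Output recorded in step 4, without the trailing exit_code line.
RECORDED = """
{
  "agents_version": "0.17.4",
  "manual_checks": {
    "safe_input_tripwire": false,
    "safe_output_tripwire": false,
    "unsafe_input_tripwire": true,
    "unsafe_output_tripwire": true
  },
  "openai_api_key_present": false,
  "runner_check": {
    "exception": "InputGuardrailTripwireTriggered",
    "output_info": {"agent": "policy-demo", "contains_pii": true},
    "runner_blocked_before_model": true
  }
}
"""

# Tripwire values the manual checks are expected to produce.
EXPECTED_MANUAL = {
    "safe_input_tripwire": False,
    "unsafe_input_tripwire": True,
    "safe_output_tripwire": False,
    "unsafe_output_tripwire": True,
}


def check_payload(payload):
    """Return a list of problems; an empty list means the run matched."""
    problems = []
    if payload.get("openai_api_key_present") is not False:
        problems.append("an API key was present in the environment")
    manual = payload.get("manual_checks", {})
    for key, want in EXPECTED_MANUAL.items():
        if manual.get(key) is not want:
            problems.append(f"manual_checks.{key} should be {want}")
    runner = payload.get("runner_check", {})
    if runner.get("runner_blocked_before_model") is not True:
        problems.append("Runner did not block the unsafe input")
    return problems


if __name__ == "__main__":
    print(check_payload(json.loads(RECORDED)) or "payload matches the recorded run")
```

In practice the payload would come from the script's stdout rather than a pasted string; the embedded copy only keeps the sketch self-contained.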
What Worked
- openai-agents==0.17.4 installed successfully in an isolated virtualenv.
- Input and output guardrail functions returned GuardrailFunctionOutput objects with expected tripwire_triggered values.
- @input_guardrail(run_in_parallel=False) worked as a pre-model policy gate in the Runner path.
- Runner.run_sync() raised InputGuardrailTripwireTriggered for unsafe input while OPENAI_API_KEY was absent.
- set_tracing_disabled(True) avoided trace-export attempts during the credential-free sandbox.
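Why run_in_parallel=False matters for the "pre-model gate" result can be illustrated with a toy asyncio model. This is not the SDK's implementation: Tripwire, guardrail, fake_model_call, run_blocking, and run_parallel are all hypothetical stand-ins. The sketch assumes the documented semantics, where a blocking guardrail completes before the agent starts and a parallel one runs alongside it.

```python
import asyncio


class Tripwire(Exception):
    """Stand-in for a guardrail tripwire exception (hypothetical name)."""


async def guardrail(text):
    await asyncio.sleep(0)  # yield once, as a real async check would
    if "@" in text:  # crude placeholder for the PII test
        raise Tripwire(text)


async def fake_model_call(calls):
    calls.append("model started")  # records that the "model" was reached
    await asyncio.sleep(0.01)
    return "response"


async def run_blocking(text, calls):
    # The guardrail must finish before the model is touched.
    await guardrail(text)
    return await fake_model_call(calls)


async def run_parallel(text, calls):
    # Both coroutines start together, so the model may begin
    # before the tripwire fires.
    _, response = await asyncio.gather(guardrail(text), fake_model_call(calls))
    return response


def demo(mode, text):
    """Run one mode and return the list of recorded model starts."""
    calls = []
    try:
        asyncio.run(mode(text, calls))
    except Tripwire:
        pass
    return calls


if __name__ == "__main__":
    unsafe = "Send jane@example.com a copy of the report."
    print("blocking:", demo(run_blocking, unsafe))
    print("parallel:", demo(run_parallel, unsafe))
```

In the blocking variant the fake model is never reached for unsafe input, which mirrors why the Runner check could pass with no API key. In the parallel variant the model call has already started by the time the tripwire fires, so a credential-free test of that mode would likely behave differently.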
What Failed or Was Not Tested
- The lab did not run a clean prompt through Runner.run_sync() because that would require a model call.
- The lab did not validate output guardrails on a real model response; it only exercised the output guardrail function directly.
- The lab did not test hosted sandbox agents, MCP servers, handoffs, streaming, tracing export, or production deployment behavior.
- The PII detector is a minimal regex demo, not a production PII classifier.
Limitations
This PoC proves that guardrail functions and blocking input tripwires can be tested locally without API credits. It does not prove end-to-end agent quality, model refusal quality, latency, pricing, hosted tracing, or false-positive rates. Production teams should add model-backed integration tests, structured-output validation, realistic PII fixtures, and trace review before treating guardrails as a release gate.
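To make the false-positive concern concrete, the regex from guardrail_poc.py can be scored against a handful of labeled strings. The fixtures below are illustrative assumptions written for this sketch, not a real PII corpus, and the helper name confusion is hypothetical.

```python
import re

# Same pattern as guardrail_poc.py.
PII_PATTERN = re.compile(r"[\w.+-]+@[\w.-]+\.\w+|sk-[A-Za-z0-9_-]{8,}")

# (text, contains_pii) pairs; labels are the author's judgment for this sketch.
FIXTURES = [
    ("Email jane@example.com the report.", True),
    ("Rotate the key sk-abcdefgh12345678 today.", True),
    ("Call me at 555-0100 after lunch.", True),      # phone number: no rule covers it
    ("Summarize the public changelog.", False),
    ("Pin lodash@4.17.21 in package.json.", False),  # version pin resembles an email
    ("Open the task-management board.", False),      # "sk-management" resembles a key
]


def confusion(fixtures):
    """Count true/false positives and negatives for PII_PATTERN."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for text, has_pii in fixtures:
        flagged = bool(PII_PATTERN.search(text))
        if flagged:
            counts["tp" if has_pii else "fp"] += 1
        else:
            counts["fn" if has_pii else "tn"] += 1
    return counts


if __name__ == "__main__":
    print(confusion(FIXTURES))
```

On these six strings the pattern misses the phone number and flags both the version pin and the hyphenated word, which is the kind of behavior a realistic fixture set would need to surface before the guardrail gates a release.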
Read the article
This note supports the public article and records what was actually checked.