How to Attach a Secure Sandbox Shell to Your OpenAI Model in Minutes
Attach a secure sandboxed shell to your OpenAI model in minutes using ShellifyAI. This tutorial walks through OpenAI tool wiring, forwarding tool calls to Shellify’s execute endpoint, security best practices, and ready-to-run examples in JS and Python.
Intro
Allowing an LLM to run shell commands unlocks powerful diagnostics and automation (filesystem inspection, running tests, building artifacts). But running arbitrary shell commands is dangerous unless you sandbox execution, set strict allow/deny rules, and capture audit logs.
ShellifyAI makes this integration fast and safe: define a local_shell tool in your OpenAI request, forward tool calls to ShellifyAI’s secure execute API, then return the results to the model. In minutes you get sandboxing, timeouts, streaming output, artifact capture, and session persistence — without writing your own container orchestration.
What you’ll learn
- How the OpenAI shell tool flow works (quick recap)
- How to wire ShellifyAI into that flow in minutes
- Security best practices and common patterns
- Ready-to-run examples (JavaScript + Python + Vercel AI SDK)
Why use ShellifyAI
ShellifyAI handles the hard parts so you don't have to build and maintain sandbox infrastructure:
- Secure sandboxed execution in ephemeral containers
- Session-based persistence for multi-step workflows
- Real-time stdout/stderr streaming for responsive UIs
- File artifact capture (uploads with signed URLs)
- Adapter support for different agent SDKs and tool-call formats
Step 0 — Prerequisites
- Create a ShellifyAI project and get your SHELLIFYAI_API_KEY from the Shellify console.
- Have an OpenAI integration that can define a function/tool (Responses API or Agents SDK where applicable).
Set environment variables
```bash
export SHELLIFYAI_API_KEY=your_api_key
```
Quick recap — how OpenAI’s shell tool works
OpenAI’s shell tool lets models propose commands via the Responses API in the form of function/tool calls. Your integration executes those commands and returns structured outputs with stdout, stderr and an outcome (exit or timeout).
Key points:
- The model outputs a function/tool call with arguments (e.g., { command: "..." }).
- Execute commands in a sandbox and return outputs including stdout, stderr and an outcome ({type: "exit", exit_code: N} or {type: "timeout"}).
- If the model includes a max_output_length or similar parameter, honor it and copy it back in your response to avoid validation errors.
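For reference, here is a minimal sketch of the output shape your integration typically returns for each tool call. Field names follow the conventions described above; verify them against the exact tool contract your integration uses:

```typescript
// Sketch of the structured output returned for each shell tool call.
// Field names follow the conventions described above; confirm against
// the exact tool contract in your integration.
type ShellOutcome =
  | { type: "exit"; exit_code: number }
  | { type: "timeout" };

interface ShellToolOutput {
  stdout: string;
  stderr: string;
  outcome: ShellOutcome;
  // Echo back max_output_length if the model specified it.
  max_output_length?: number;
}
```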
How ShellifyAI fits — the simple flow
- Define a local_shell tool in your OpenAI request.
- When the model calls the tool, forward the command to ShellifyAI’s API: POST https://shellifyai.com/v1/execute with your SHELLIFYAI_API_KEY in the x-api-key header.
- ShellifyAI executes in a sandbox and returns structured events (stdout/stderr logs, artifacts, status updates, and completed results).
- Map Shellify’s response back to the model via the Responses API (send a function_call_output item referencing the call_id in a follow-up request).
JavaScript (OpenAI client)
This example uses the direct POST to the execute endpoint. Note the environment variable name and the single execute endpoint — the API key encodes your project, so there's no separate projectId parameter.
```typescript
import OpenAI from "openai";

const client = new OpenAI();

const tools = [{
  type: "function",
  name: "local_shell",
  description: "Execute shell commands in a secure sandbox",
  parameters: {
    type: "object",
    properties: {
      command: { type: "string", description: "Shell command to execute" }
    },
    required: ["command"]
  }
}];

const response = await client.responses.create({
  model: "gpt-5.1",
  input: "Create a Python file that prints Hello World and run it",
  tools
});

for (const item of response.output) {
  if (item.type === "function_call" && item.name === "local_shell") {
    const args = JSON.parse(item.arguments || "{}");

    // Forward the model's command to Shellify's execute endpoint.
    const result = await fetch("https://shellifyai.com/v1/execute", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": process.env.SHELLIFYAI_API_KEY!
      },
      body: JSON.stringify({
        adapterType: "local_shell",
        tool: "local_shell",
        payload: {
          command: args.command,
          // optional: sessionId, timeoutMs, workingDirectory, env, systemMessage
          sessionId: args.sessionId,
          timeoutMs: args.timeoutMs,
        }
      })
    }).then(r => r.json());

    // Return the result to the model as a function_call_output item
    // in a follow-up Responses API request.
    await client.responses.create({
      model: "gpt-5.1",
      previous_response_id: response.id,
      input: [{
        type: "function_call_output",
        call_id: item.call_id,
        output: JSON.stringify(result)
      }],
      tools
    });
  }
}
```
Python (OpenAI client)
```python
from openai import OpenAI
import requests
import json
import os

client = OpenAI()

tools = [{
    "type": "function",
    "name": "local_shell",
    "description": "Execute shell commands in a secure sandbox",
    "parameters": {
        "type": "object",
        "properties": {
            "command": {"type": "string", "description": "Shell command to execute"}
        },
        "required": ["command"]
    }
}]

response = client.responses.create(
    model="gpt-5.1",
    input="Create a Python file that prints Hello World and run it",
    tools=tools
)

for item in response.output:
    if item.type == "function_call" and item.name == "local_shell":
        args = json.loads(item.arguments or "{}")

        # Forward the model's command to Shellify's execute endpoint.
        result = requests.post(
            "https://shellifyai.com/v1/execute",
            headers={
                "Content-Type": "application/json",
                "x-api-key": os.environ["SHELLIFYAI_API_KEY"]
            },
            json={
                "adapterType": "local_shell",
                "tool": "local_shell",
                "payload": {
                    "command": args.get("command"),
                    "sessionId": args.get("sessionId"),
                    "timeoutMs": args.get("timeoutMs"),
                }
            }
        ).json()

        # Return the result to the model as a function_call_output item
        # in a follow-up Responses API request.
        client.responses.create(
            model="gpt-5.1",
            previous_response_id=response.id,
            input=[{
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(result)
            }],
            tools=tools
        )
```
Vercel AI SDK
If you use the Vercel AI SDK, use the Shellify helper from @shellifyai/shell-tool. The helper handles execute calls for you; provide your SHELLIFYAI_API_KEY and the SDK takes care of forwarding commands and streaming responses.
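A minimal sketch of what that wiring might look like. The exact export from @shellifyai/shell-tool may differ, so treat `shellifyTool` here as a hypothetical name and check the package docs; the streamText and openai calls are the Vercel AI SDK's standard API:

```typescript
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";
// Hypothetical import: check @shellifyai/shell-tool's docs for the real export name.
import { shellifyTool } from "@shellifyai/shell-tool";

const result = streamText({
  model: openai("gpt-5.1"),
  prompt: "Create a Python file that prints Hello World and run it",
  tools: {
    // The helper forwards commands to Shellify's execute endpoint and
    // streams results back, authenticated with your SHELLIFYAI_API_KEY.
    local_shell: shellifyTool({ apiKey: process.env.SHELLIFYAI_API_KEY! }),
  },
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```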
Handling streaming & artifacts
For responsive UI and long-running commands, enable streaming by setting the Accept header to application/jsonl or by adding ?stream=true to the execute POST. Shellify emits structured events as JSON lines: meta, status, log (stdout/stderr), artifact (filename + signed URL), and completed status. Use these events to progressively render output and surface artifacts (e.g., test reports, built binaries) to users without waiting for the full job to finish.
When capturing artifacts, Shellify uploads files and returns signed URLs in artifact events. Store those URLs or surface them in the chat UI so users can download outputs produced by the shell session.
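Here is a rough sketch of consuming that JSONL stream in TypeScript. The event payload fields (data, filename, url) are assumptions based on the event kinds listed above, so confirm them against the API reference:

```typescript
// Sketch of a JSONL stream consumer. Event payload fields are assumed;
// check Shellify's API reference for the exact schema.
const res = await fetch("https://shellifyai.com/v1/execute?stream=true", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Accept": "application/jsonl",
    "x-api-key": process.env.SHELLIFYAI_API_KEY!,
  },
  body: JSON.stringify({ payload: { command: "npm test" } }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop()!; // keep any partial trailing line for the next chunk
  for (const line of lines.filter(Boolean)) {
    const event = JSON.parse(line); // one JSON-encoded event per line
    switch (event.type) {
      case "log":      /* append event.data to the UI console */ break;
      case "artifact": /* surface event.filename + event.url as a download */ break;
      case "status":   /* update a progress indicator */ break;
    }
  }
}
```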
Security best practices
Shellify provides sandboxing by default, but you should still apply additional controls depending on your risk profile:
- Allow/deny lists: Block risky commands (e.g., rm -rf, curl to external hosts) at the integration layer when possible; see the sketch after this list.
- Timeouts: Respect the timeoutMs provided to Shellify and configure conservative defaults for unknown commands.
- Session scope: Use sessionId only when you need filesystem persistence across steps; ephemeral sessions reduce blast radius.
- Network restrictions: Disable network access in sandboxes unless explicitly necessary.
- Least privilege: Run commands as a non-root user inside the container and restrict filesystem mounts.
- Audit logging: Persist command, stdout/stderr, and artifact metadata for traceability and debugging.
- System messages: Shellify appends security policies to systemMessage, so custom overrides cannot bypass sandbox rules.
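As an example of the allow/deny point above, a crude integration-layer check might look like this. The patterns are illustrative, not a complete policy:

```typescript
// Illustrative deny-list check run before forwarding a command to Shellify.
// These patterns are examples only; tune them to your own risk profile.
const DENY_PATTERNS: RegExp[] = [
  /\brm\s+-rf\b/,      // recursive deletes
  /\bcurl\b|\bwget\b/, // outbound network fetches
  /\bsudo\b/,          // privilege escalation attempts
];

function isCommandAllowed(command: string): boolean {
  return !DENY_PATTERNS.some((p) => p.test(command));
}

// Before calling /v1/execute:
// if (!isCommandAllowed(args.command)) {
//   return {
//     stdout: "",
//     stderr: "Command blocked by policy",
//     outcome: { type: "exit", exit_code: 126 },
//   };
// }
```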
Mapping Shellify responses to OpenAI
Shellify returns a structured result containing events and a summary. OpenAI expects a tool output with stdout, stderr, and an outcome ({ "type": "exit", "exit_code": N } or { "type": "timeout" }). Map the final aggregated stdout/stderr and set the outcome based on the exit code or timeout events. Also copy any max output length values the model specified back into your returned tool output to avoid validation errors.
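As a concrete illustration, a mapping function might look like the following. The event shape here (type, stream, data, exit_code, timed_out) is an assumption about Shellify's response format, so adjust it to the actual schema:

```typescript
// Sketch: aggregate Shellify events into the tool output OpenAI expects.
// Event field names are assumptions; verify against the API reference.
interface ShellifyEvent {
  type: "meta" | "status" | "log" | "artifact" | "completed";
  stream?: "stdout" | "stderr";
  data?: string;
  exit_code?: number;
  timed_out?: boolean;
}

function toToolOutput(events: ShellifyEvent[], maxOutputLength?: number) {
  let stdout = "";
  let stderr = "";
  let outcome: { type: "exit"; exit_code: number } | { type: "timeout" } =
    { type: "exit", exit_code: 0 };

  for (const e of events) {
    if (e.type === "log" && e.stream === "stdout") stdout += e.data ?? "";
    if (e.type === "log" && e.stream === "stderr") stderr += e.data ?? "";
    if (e.type === "completed") {
      outcome = e.timed_out
        ? { type: "timeout" }
        : { type: "exit", exit_code: e.exit_code ?? 0 };
    }
  }

  // Echo back max_output_length if the model specified one, truncating to it.
  if (maxOutputLength !== undefined) {
    return {
      stdout: stdout.slice(0, maxOutputLength),
      stderr: stderr.slice(0, maxOutputLength),
      outcome,
      max_output_length: maxOutputLength,
    };
  }
  return { stdout, stderr, outcome };
}
```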
Advanced tips & troubleshooting
- adapterType: only override it if you need a non-default adapter for specific behavior—otherwise omit it and let the project default apply.
- Long outputs: rely on streaming to avoid very large payloads and to keep user experience snappy.
- Non-interactive commands: ensure tools the model calls are non-interactive (no password prompts or full-screen editors).
- Exit codes: Capture non-zero exit codes and still return stdout/stderr so the model can reason about warnings vs fatal errors.
Wrap-up — minutes, not days
With ShellifyAI you don’t need to build sandbox infrastructure or artifact handling. Define a local_shell tool, forward model tool calls to POST https://shellifyai.com/v1/execute with your SHELLIFYAI_API_KEY, and return the structured result to the model. Follow the security checklist and you can safely attach a sandboxed shell to your model in minutes.
Next steps
- Create a Shellify project and copy your credentials into SHELLIFYAI_API_KEY.
- Try the JavaScript quick-start snippet above.
- If you’re using Vercel, install @shellifyai/shell-tool and use the shellify helper for automatic execution.
Resources
For full API reference and developer documentation, see https://shellifyai.com/docs
Ready to get started?
Start building powerful AI applications with secure, scalable execution environments.