The OpenShell Moment
At NVIDIA GTC 2026, Jensen Huang made a declaration that landed differently than his usual hyperbole: every company needs an agent runtime strategy. Not a model strategy. Not a GPU strategy. A runtime strategy, because the agents that now run unattended across your infrastructure are an entirely different threat surface than the chatbots that came before.
The timing was deliberate. Alongside the NemoClaw announcement, NVIDIA open-sourced OpenShell under Apache 2.0: a dedicated runtime designed to give autonomous AI agents the execution environment they've always needed but never had. Not a prompt wrapper. Not a behavioral system prompt. An actual infrastructure-level security boundary between agents and everything they shouldn't touch.
This piece breaks down what OpenShell is, how it works architecturally, and why its core design decision, out-of-process policy enforcement, represents a meaningful shift in how the industry should think about deploying AI agents at scale.
Why the Old Model Fails
The conventional approach to AI agent safety is to build guardrails inside the agent. System prompts instruct the model to be careful. Tool definitions constrain what functions get called. Some frameworks add an approval layer where the agent pauses and asks before destructive operations. This approach made sense for stateless chatbots.
It breaks down for autonomous agents.
NVIDIA's team put it bluntly in the OpenShell announcement: "The critical failure mode is guardrails living inside the same process they're supposed to be guarding." Consider what a modern long-running agent actually does: it remembers context across sessions, spawns subagents to act independently, writes its own code to learn new skills mid-task, installs packages, and keeps executing long after you've closed your laptop. A stateless chatbot has no meaningful attack surface. An agent with persistent shell access, live credentials, the ability to rewrite its own tooling, and six hours of accumulated context running against your internal APIs is a fundamentally different threat model.
The attack vectors are concrete:
- Prompt injection via third-party content: the agent reads a malicious webpage or email, and the injected instruction overrides its original task.
- Credential leakage: API keys stored in environment variables or config files become accessible to any code the agent installs.
- Subagent permission inheritance: when the agent spawns a child agent to parallelize work, that child inherits parent-level permissions it was never meant to have.
- Unreviewed binary execution: agents that install skills or packages at runtime are running code nobody reviewed, with full filesystem access.
- Unconstrained egress: nothing stops a compromised agent from exfiltrating data to an external endpoint.
Internal guardrails can't reliably prevent any of these. A model that's been prompt-injected will use the same behavioral rules it uses for legitimate tasks, just toward the attacker's goals. The environment needed a different approach.
Architecture Overview
OpenShell's core architectural decision is out-of-process policy enforcement. Instead of relying on behavioral constraints inside the model, it enforces constraints on the environment the agent runs in. The agent cannot override them, even if compromised, because they execute at the infrastructure layer, not the application layer.
NVIDIA describes it as "the browser tab model applied to agents": sessions are isolated, and permissions are verified by the runtime before any action executes. Tools like Claude Code and Cursor ship with valuable internal guardrails, but those protections live inside the agent. OpenShell wraps those harnesses, moving the ultimate control point entirely outside the agent's reach.
Under the hood, all components run as a K3s Kubernetes cluster inside a single Docker container; no separate K8s install is required. The openshell gateway commands handle provisioning the container and cluster automatically.
```
+-----------------------------------------------------------+
| HOST MACHINE                                              |
|                                                           |
|  +-----------------------------------------------------+  |
|  |  OPENSHELL GATEWAY                                  |  |
|  |  (Control Plane + Auth Boundary)                    |  |
|  |                                                     |  |
|  |  +--------------+       +------------------------+  |  |
|  |  |   SANDBOX    |       |     POLICY ENGINE      |  |  |
|  |  | (K3s/Docker- | ----> | Filesystem · Network   |  |  |
|  |  |  Isolated)   |       | Process · Inference    |  |  |
|  |  |              |       +------------------------+  |  |
|  |  |   [Agent]    |                   |               |  |
|  |  |  Claude Code |                   v               |  |
|  |  |   OpenClaw   |       +------------------------+  |  |
|  |  |  Codex etc.  |       |     PRIVACY ROUTER     |  |  |
|  |  +--------------+       |   Local --> Frontier   |  |  |
|  |                         | (Claude, GPT, Ollama)  |  |  |
|  |                         +------------------------+  |  |
|  +-----------------------------------------------------+  |
+-----------------------------------------------------------+
```
The four primary components each have a distinct role:
| Component | Role |
|---|---|
| Gateway | Control-plane API that coordinates sandbox lifecycle and acts as the authentication boundary |
| Sandbox | Isolated runtime with container supervision and policy-enforced egress routing |
| Policy Engine | Enforces filesystem, network, and process constraints from application layer down to kernel |
| Privacy Router | Privacy-aware LLM routing that keeps sensitive context on sandbox compute |
The Sandbox
The sandbox is the runtime environment in which agents actually execute. It's designed specifically for long-running, self-evolving agents, not generic container isolation. This distinction matters because the requirements are different:
- Skill development and verification: agents that learn new capabilities at runtime need a place to do that safely, with verification before new code runs
- Programmable system and network isolation: the boundaries can be configured per-agent and updated without restart
- Isolated execution environments that agents can break: if an agent corrupts its own environment, it crashes the sandbox, not the host
- A full audit trail: every allow and deny decision is logged, with policy updates tracked against developer approvals
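The announcement doesn't specify the audit trail's on-disk format. As a rough illustration of what one logged decision might carry (every field name below is a hypothetical assumption, not OpenShell's actual schema), an entry could plausibly record the actor, the attempted action, and the verdict:

```yaml
# Hypothetical audit-trail entry -- illustrative sketch only,
# not OpenShell's documented schema
- timestamp: "2026-03-18T14:22:07Z"
  sandbox: my-sandbox
  layer: network
  action: "POST api.github.com/repos/octocat/hello-world/issues"
  decision: deny
  policy: github-readonly
  reason: "method POST not permitted for this destination"
```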
Creating a sandbox takes one command:
openshell sandbox create -- claude
# or: opencode, codex, ollama, openclaw
A gateway is created automatically on first use. For remote deployment (say, on an NVIDIA DGX Spark), you pass --remote user@host to the create command and the same local workflow applies. Every sandbox starts with minimal outbound access: the default posture is deny.
The sandbox container ships with a useful default toolset that agents need to function:
| Category | Tools |
|---|---|
| Agent | claude, opencode, codex |
| Language | python 3.13, node 22 |
| Developer | gh, git, vim, nano |
| Networking | ping, dig, nslookup, nc, traceroute, netstat |
The Policy Engine
The policy engine is where OpenShell's security model gets precise. It enforces constraints across four distinct domains, each operating at a different layer of the stack:
| Layer | What It Protects | When It Applies |
|---|---|---|
| Filesystem | Prevents reads/writes outside allowed paths | Locked at sandbox creation |
| Network | Blocks unauthorized outbound connections | Hot-reloadable at runtime |
| Process | Blocks privilege escalation and dangerous syscalls | Locked at sandbox creation |
| Inference | Reroutes model API calls to controlled backends | Hot-reloadable at runtime |
The split between locked-at-creation and hot-reloadable is deliberate. Filesystem and process constraints define the fundamental security boundary of a sandbox β changing them at runtime would undermine the isolation guarantee. Network and inference policies, by contrast, are operational: you need to be able to open new API endpoints or switch inference backends as work progresses, without tearing down a running agent session.
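As a rough sketch of how that split might look in a single policy file, the four domains could sit side by side. Only the network and inference sections here mirror syntax the announcement actually shows; the filesystem and process keys are assumptions for illustration:

```yaml
# Hypothetical combined policy -- filesystem/process keys are assumed,
# not OpenShell's documented syntax
filesystem:                 # locked at sandbox creation
  allow:
    - /workspace/**
process:                    # locked at sandbox creation
  deny_syscalls: [ptrace, mount]
network:                    # hot-reloadable at runtime
  egress:
    - destination: api.github.com
      methods: [GET, HEAD]
inference:                  # hot-reloadable at runtime
  router:
    default: local
```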
Declarative YAML Policies
All policies are expressed as declarative YAML files. A network policy that grants read-only GitHub API access looks like this:
network:
egress:
- destination: api.github.com
methods: [GET, HEAD]
paths:
- /repos/*
- /zen
- destination: pypi.org
methods: [GET]
The policy engine enforces at the HTTP method and path level, meaning an agent can call GET /repos/octocat/hello-world but gets a policy_denied error if it tries POST /repos/octocat/hello-world/issues. This isn't firewall-level IP filtering; it's application-layer enforcement that understands the semantics of HTTP verbs.
# Sandbox trying to POST without permission
sandbox$ curl -sS -X POST https://api.github.com/repos/octocat/hello-world/issues \
-d '{"title":"oops"}'
{"error":"policy_denied","detail":"POST /repos/octocat/hello-world/issues not permitted by policy"}
When an agent hits a constraint it needs to exceed, it doesn't just fail silently; it can reason about the roadblock and propose a policy update. That proposal surfaces to the human operator, who retains final approval. This creates a natural feedback loop: the agent's capabilities can expand incrementally as trust is established, with a full paper trail of every permission grant.
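The announcement doesn't show the wire format of such a proposal, but conceptually it is just a policy delta awaiting approval. A hypothetical sketch (the proposal structure and field names below are assumed, not documented):

```yaml
# Hypothetical agent-proposed policy update -- format assumed for illustration
proposal:
  sandbox: my-sandbox
  reason: "Need to open issues on octocat/hello-world to report a failing test"
  change:
    network:
      egress:
        - destination: api.github.com
          methods: [POST]
          paths:
            - /repos/octocat/hello-world/issues
  status: pending_operator_approval
```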
The Privacy Router
The privacy router addresses a problem that's easy to overlook when you're focused on sandboxing: where does inference actually happen, and who can see the context you're sending?
Autonomous agents accumulate significant context over time: user data, internal documents, code with proprietary logic, API responses. When that context gets sent to a frontier model like Claude or GPT, it leaves your infrastructure. For many organizations, that's a compliance problem. For others, it's simply a cost problem: frontier models charge per token, and long-running agents with large contexts get expensive fast.
OpenShell's privacy router sits between the agent and all inference backends. It makes routing decisions based on your defined privacy and cost policy, not the agent's judgment. The default posture is to keep sensitive context on local compute using open models. Frontier model access requires an explicit policy allowance.
inference:
router:
default: local # Use Ollama by default
rules:
- condition: context_sensitivity == "high"
route: local
- condition: task_type == "code_synthesis"
route: frontier # Allow Claude/GPT for code tasks
The router's credential management is particularly thoughtful: it strips the agent's own API credentials before forwarding to a backend, then injects the controlled backend credentials. The agent never needs its own API keys for inference, and compromising the agent doesn't yield usable credentials for the frontier APIs.
This is also what enables OpenShell to be model-agnostic by design. The privacy router abstracts the inference layer entirely. Whether you're running Ollama locally, hitting Claude via Anthropic's API, or routing to OpenAI, the agent code is identical. The routing is infrastructure configuration, not application code.
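To make that concrete: moving the same agent from a fully local posture to one that allows frontier models for specific tasks is purely a router-config change, using the same rule syntax shown above (posture B is an assumed variation, not copied from the docs):

```yaml
# Posture A: everything stays on local compute
inference:
  router:
    default: local

# Posture B (hypothetical variation): local by default,
# frontier models allowed only for code synthesis
inference:
  router:
    default: local
    rules:
      - condition: task_type == "code_synthesis"
        route: frontier
```

The agent harness never sees the difference between the two; only the runtime configuration changes.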
GPU Passthrough
For teams running local inference, whether for privacy, cost, or latency reasons, OpenShell supports GPU passthrough into sandboxes. The flag is straightforward:
openshell sandbox create --gpu --from gpu-sandbox -- claude
The CLI auto-bootstraps a GPU-enabled gateway on first use. GPU intent is also inferred automatically for community sandbox images with "gpu" in the name.
The requirements are honest about current limitations: NVIDIA drivers and the NVIDIA Container Toolkit must be installed on the host. The default base sandbox image doesn't include GPU drivers; you need a custom sandbox image built for your specific workload. NVIDIA provides a Bring Your Own Container (BYOC) example in the OpenShell repository for this use case.
The integration with NVIDIA DGX Spark is worth noting separately. The --remote spark flag allows you to point a local OpenShell CLI at a remote DGX Spark, and the agent runs there, with GPU access, while you interact with it from your laptop. This is the architecture NemoClaw uses for its one-command deployment: openshell sandbox create --remote spark --from openclaw.
Agent Support Matrix
OpenShell ships with out-of-the-box support for the five most commonly used coding and autonomous agents:
| Agent | Source | Notes |
|---|---|---|
| Claude Code | Base sandbox | Works out of the box. Provider uses ANTHROPIC_API_KEY. |
| OpenCode | Base sandbox | Works out of the box. Provider uses OPENAI_API_KEY or OPENROUTER_API_KEY. |
| Codex | Base sandbox | Works out of the box. Provider uses OPENAI_API_KEY. |
| OpenClaw | Community sandbox | Launch with openshell sandbox create --from openclaw. |
| Ollama | Community sandbox | Launch with openshell sandbox create --from ollama. |
The agent code itself runs unmodified inside OpenShell. This is a critical design constraint: the security model can't require changes to the agent harness, because agent harnesses change rapidly and because you want to be able to sandbox third-party agents you don't control. OpenShell wraps the environment, not the application.
Community sandbox images extend the base with agent-specific tooling. OpenClaw, for example, gets the OpenClaw binary and its configuration scaffolding. Ollama sandboxes include the Ollama server and GPU-aware startup logic. The community repository (NVIDIA/OpenShell-Community) hosts these extended images.
Hot-Reloadable Policies
One of the more practically important features is the ability to update network and inference policies on a running sandbox without restarting the agent. In real-world agent workflows, the set of external resources an agent needs access to evolves as the task progresses. Requiring a restart to grant access to a new API endpoint would be disruptive to long-running sessions.
The workflow looks like this:
# Agent is running, hits a blocked endpoint
sandbox$ curl -sS https://api.github.com/zen
curl: (56) Received HTTP code 403 from proxy after CONNECT
# On the host, operator applies an updated policy
$ openshell policy set my-sandbox \
--policy github-readonly.yaml \
--wait
# Agent reconnects without restart
$ openshell sandbox connect my-sandbox
sandbox$ curl -sS https://api.github.com/zen
Anything added dilutes everything else.
The --wait flag blocks until the policy has propagated to the running sandbox. This ensures there's no window where the old policy is still enforced while the new one is loading.
Policy updates are also the mechanism by which agents can expand their own capabilities, with human approval. When an agent determines it needs access to a new resource, it surfaces a proposed policy change. The operator reviews, applies it with openshell policy set, and the agent continues without interruption. Every policy change is logged in the audit trail.
Credential Management
OpenShell manages credentials as providers: named credential bundles that are injected into sandboxes at creation. The CLI auto-discovers credentials for recognized agents from your shell environment. For Claude Code, it reads ANTHROPIC_API_KEY. For OpenCode, it reads OPENAI_API_KEY or OPENROUTER_API_KEY. You can also create providers explicitly:
openshell provider create \
--type anthropic \
--from-existing # reads from current env vars
The critical property: credentials never leak into the sandbox filesystem. They are injected as environment variables at runtime, inside the container, and are never written to disk. If an agent somehow escapes its sandbox (an attack class OpenShell's defense-in-depth aims to prevent), it does not carry credentials with it.
For the privacy router use case, the managed credential system enables a clean separation: the agent can hold API keys that differ from the ones used for actual inference, or hold none at all. The router holds the production inference credentials and injects them only when routing requests that meet policy criteria.
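In configuration terms, that separation could plausibly be expressed as two distinct providers: one (possibly empty) visible to the agent, and one held only by the router. The provider fields below are assumptions for illustration, not documented OpenShell syntax:

```yaml
# Hypothetical provider separation -- field names assumed, not documented
providers:
  agent-facing:
    type: none            # the agent holds no inference credentials at all
  router-backend:
    type: anthropic
    source: host-env      # injected by the router per request,
                          # never written to the sandbox filesystem
```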
Getting Started
The installation is deliberately minimal. Two options:
# Option 1: Binary (recommended)
curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh
# Option 2: PyPI (requires uv)
uv tool install -U openshell
Prerequisites: Docker Desktop (or a Docker daemon) must be running. For GPU support, NVIDIA drivers and the NVIDIA Container Toolkit are also required.
The quickstart from there is three commands:
# Create a sandbox with Claude Code
openshell sandbox create -- claude
# Apply a policy to open GitHub read access
openshell policy set my-sandbox --policy github-readonly.yaml --wait
# List running sandboxes
openshell sandbox list
The full documentation is at docs.nvidia.com/openshell, including the sandbox policy quickstart walkthrough and BYOC examples for custom sandbox images.
OpenShell vs Alternatives
It's worth being precise about what OpenShell is and isn't, relative to adjacent tools in the ecosystem.
vs. E2B / Daytona / Morph (Cloud Sandboxes)
Cloud sandbox providers like E2B and Daytona offer isolated execution environments for agents. The difference is architectural: those are hosted sandboxes, and your agent code runs in their infrastructure. OpenShell is a self-hosted runtime you deploy on your own infrastructure, with policy enforcement under your control. For teams with strict data residency requirements, this is the meaningful distinction.
vs. Docker / Kubernetes (Container Isolation)
OpenShell uses containers and K3s under the hood, but container isolation alone doesn't give you the agent-specific primitives: hot-reloadable network policies at the HTTP method level, credential injection without filesystem exposure, privacy-aware inference routing, or the agent-proposal mechanism for policy updates. OpenShell is an agent runtime built on top of container infrastructure, not a substitute for thinking about it.
vs. Firecracker / gVisor (Micro-VMs)
Micro-VM approaches provide stronger isolation guarantees than containers, at the cost of performance overhead and operational complexity. OpenShell's layered defense-in-depth (container isolation + kernel-level process policy + application-layer network enforcement) targets the practical threat model for AI agents without the overhead. Alpha software disclaimer applies: the current implementation may evolve as the threat landscape clarifies.
vs. Behavioral Guardrails (System Prompts + Approval Flows)
This is the sharpest comparison. Behavioral guardrails are not redundant with OpenShell; they address different layers. A well-crafted system prompt reduces the probability of an agent doing something unintended. OpenShell's policy engine prevents that action from completing even if the agent tries. Both belong in a complete agent security posture; neither is sufficient alone.
| Approach | Enforcement Layer | Bypass-Resistant | Self-Hosted |
|---|---|---|---|
| System Prompts | Model behavior | No | N/A |
| E2B / Daytona | Cloud container | Partial | No |
| Raw Docker | Container | Partial | Yes |
| OpenShell | Infra + App layer | Yes | Yes |
NemoClaw Integration
OpenShell is part of a broader NVIDIA initiative announced at GTC 2026. NemoClaw is an open-source stack that bundles OpenShell with NVIDIA's Nemotron models and the OpenClaw agent harness into a single deployable package. The goal: simplify running always-on, self-evolving agents with a single command.
# Full NemoClaw on DGX Spark: one command
openshell sandbox create --remote spark --from openclaw
NemoClaw represents NVIDIA's opinionated take on the "full agent stack": open models (Nemotron) + open runtime (OpenShell) + open agent framework (OpenClaw). The combination targets enterprises that want to run powerful agents without their data leaving their infrastructure, which is particularly relevant for healthcare, finance, and government deployments where data residency is non-negotiable.
The NVIDIA Agent Toolkit is the broader umbrella: models, tools, evaluation frameworks, and runtimes for building, testing, and optimizing long-running agents. OpenShell sits at the runtime layer of that stack, providing the security substrate on which everything else operates.
Limitations & Open Questions
OpenShell is labeled alpha software. NVIDIA is clear about current limitations:
- Single-player mode only: one developer, one environment, one gateway. Multi-tenant enterprise deployments are on the roadmap but not yet implemented. Organizations considering deploying this for multiple teams should plan to wait for a more mature version.
- Custom GPU images required: GPU passthrough works, but the default base sandbox image doesn't include GPU drivers. Building a custom image is a meaningful setup cost for teams without container expertise.
- Community sandbox coverage: the base image supports Claude Code, OpenCode, and Codex. OpenClaw and Ollama require community images. Agents beyond these five need custom sandbox construction.
- Policy complexity at scale: YAML-based policies are human-readable and correct for small deployments. For organizations with dozens of agent types and hundreds of external services, the policy management surface will need additional tooling (policy libraries, inheritance, automated generation).
- Audit log tooling: the audit trail is generated, but there's limited tooling for querying, alerting on, or visualizing it. This is expected at alpha stage, but it is a gap for compliance-focused buyers.
The open questions worth watching:
- How will multi-tenant isolation be implemented? K3s namespaces, separate clusters, or something else?
- Will the policy engine support semantic content inspection (not just network destinations and HTTP methods)?
- How does the audit trail integrate with enterprise SIEM systems?
- What's the performance overhead of L7 policy enforcement on high-throughput agent workloads?
None of these are blockers for individual developers or small teams experimenting today. They are relevant for enterprise procurement decisions.
The Bottom Line
The timing of OpenShell's release reflects where the agent ecosystem actually is. Autonomous agents that run for hours, install packages, spawn subagents, and hold live credentials are no longer theoretical; they're in production at forward-leaning engineering organizations right now. The security infrastructure to run them safely has been the missing piece.
OpenShell's core contribution is moving the control point outside the agent. Not better prompts. Not tighter approval flows. A genuine infrastructure boundary that the agent cannot reason around, prompt-inject through, or override by rewriting its own code. That's architecturally new in this ecosystem.
Whether NVIDIA's specific implementation becomes the dominant runtime or gets eclipsed by something faster-moving from the open-source community is an open question. What's not open: the industry needed something like this, and "out-of-process policy enforcement" is the right architectural frame for building it.
The six to twelve months after GTC 2026 will be decisive. Every organization deploying long-running agents in that window is making an infrastructure decision that will be expensive to change later. OpenShell is, at minimum, the clearest thinking on the problem that's been published under an open license. That alone makes it worth understanding deeply, even if you ultimately build something else.
References
- NVIDIA Technical Blog, "Run Autonomous, Self-Evolving Agents More Safely with NVIDIA OpenShell." Primary source for architecture details, design rationale, and the NemoClaw integration announcement.
- NVIDIA/OpenShell GitHub repository. Source code, README, installation instructions, agent support matrix, and policy examples.
- NVIDIA OpenShell documentation. Official docs covering sandbox creation, policy configuration, GPU support, and the BYOC guide.
- NVIDIA News, NemoClaw announcement at GTC 2026. Jensen Huang's announcement of the full NemoClaw stack incorporating OpenShell and NVIDIA Nemotron models.
- NVIDIA/OpenShell-Community GitHub. Community sandbox images for OpenClaw, Ollama, and extended agent configurations.
- Cisco Blogs, "Securing Enterprise Agents with NVIDIA OpenShell and Cisco AI Defense." Enterprise security perspective on OpenShell's infrastructure guardrails.
- MarkTechPost, "NVIDIA AI Open-Sources OpenShell." Coverage of the Apache 2.0 release and technical overview.
- MindStudio, "What Is OpenShell? NVIDIA's Open-Source Security Runtime." Independent analysis of the policy enforcement model and enterprise implications.
- K3s, lightweight Kubernetes. The Kubernetes distribution that powers OpenShell's container orchestration layer.
- NVIDIA Agent Toolkit, GTC 2026 announcement. The full deployment stack (models, tools, evaluation, runtimes) of which OpenShell is the security runtime layer.