For network engineers, platform engineers, SREs, and engineering leaders deciding whether to give AI agents access to production.
The Problem With Giving AI Agents a Terminal
An AI agent that can only answer questions is a search engine with better grammar.
The useful AI agent — the one that actually reduces your operational burden — can run commands. It queries Azure APIs, checks route tables, lists NSG rules, deletes stale firewall entries. It acts.
And that is the problem.
Every AI agent that can act is one misread output, one stale context window, one confidently wrong inference away from running az network nsg rule delete on the wrong rule in the wrong resource group. Not maliciously. Just incorrectly. Language models are optimized for coherent output, not for operational correctness.
The Agentic Safety Shell puts a deterministic gate between the agent’s reasoning and the system acting on it.
How AI Agents Execute Commands Today
AI agent decides an action is needed
→ Formats a shell command or API call
→ Calls subprocess.run() or equivalent
→ Command executes immediately
→ Output returned to agent
→ No human saw the command
→ No risk classification
→ No record of reasoning vs. what actually ran
→ No distinction between "read" and "change"
→ Agent continues reasoning from the result
The agent’s judgment and the command execution are the same step. If the judgment is wrong — wrong hypothesis, stale context, misread output — the infrastructure change has already happened.
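In code, today's collapsed step is nothing more than a direct subprocess call. A minimal sketch of the antipattern (the function name and the echoed command are illustrative):

```python
import subprocess

def naive_agent_step(llm_command: str) -> str:
    """The antipattern: the model's text output IS the executed command.
    No classification, no human gate, no audit entry."""
    result = subprocess.run(
        llm_command, shell=True, capture_output=True, text=True, timeout=120
    )
    # The agent keeps reasoning from whatever came back, right or wrong.
    return result.stdout

# By the time anyone could object, the command has already run.
output = naive_agent_step("echo simulated-agent-command")
```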
What the Safety Shell Does
A four-stage pipeline between the AI agent and the operating system. Every command passes through every stage.
┌──────────────────────────────────────────────────────────────────┐
│ AI Agent produces: │
│ { "command": "...", "reasoning": "..." } │
└────────────────────────────┬─────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ Stage 1 — Classification (four-tier, no LLM) │
│ │
│ Tier 0: Forbidden list → BLOCKED unconditionally │
│ Tier 1: Allowlist → SAFE if known; RISKY if unknown │
│ Tier 2: Azure verb → list/show/get = SAFE │
│ create/delete/update = RISKY │
│ Tier 3: Patterns → sudo, chaining, shell evasion→ RISKY │
│ │
│ No language model. No inference. No prompt. Static logic. │
└──────┬──────────────────────────┬──────────────────────┬─────────┘
│ │ │
SAFE RISKY BLOCKED (Tier 0)
│ │ │
│ ▼ ▼
│ ┌───────────────────────────┐ Denied immediately.
│ │ Stage 2 — Human Gate │ No human prompt.
│ │ │ No override path.
│ │ Pipeline halts. │
│ │ Human sees: command, │
│ │ reasoning, risk │
│ │ Approve / Deny / Modify │
│ │ Fail-closed on timeout │
│ └────────────┬──────────────┘
│ Approved
│ │
▼ ▼
┌──────────────────────────────────────────────────────────────────┐
│ Stage 3 — Execution │
│ │
│ Subprocess runs with 120-second timeout │
│ Exit code and output captured │
└────────────────────────────┬─────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ Stage 4 — Output Processing │
│ │
│ Truncate: output capped at 200 lines before agent sees it │
│ Redact: API keys, tokens, passwords stripped by regex │
│ Log: append-only JSONL audit entry written │
└────────────────────────────┬─────────────────────────────────────┘
│
▼
Structured response returned to AI agent:
{ status, action, output, output_metadata }
The LLM reasons about what to do. Deterministic logic decides whether it is safe to do it.
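Stage 4's truncation step can be sketched in a few lines. The 200-line cap comes from the diagram above; the function name and the metadata shape are assumptions, not the library's actual API:

```python
MAX_LINES = 200  # Stage 4 cap before the agent sees the output

def truncate_output(raw: str, max_lines: int = MAX_LINES):
    """Cap output and record that the view is partial, so the agent
    knows it is reasoning from a prefix, not the whole result."""
    lines = raw.splitlines()
    truncated = len(lines) > max_lines
    kept = "\n".join(lines[:max_lines])
    if truncated:
        kept += f"\n... [truncated: {len(lines) - max_lines} lines omitted]"
    return kept, {"total_lines": len(lines), "truncated": truncated}
```

The metadata travels back to the agent alongside the output, so a follow-up query can narrow the request instead of trusting a silently clipped view.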
The Four Tiers
Tier 0 — Forbidden
Blocked unconditionally. No human-in-the-loop (HITL) prompt. No override path. The answer is always no.
- `rm -rf /` (root filesystem wipe)
- `mkfs.*` (disk format)
- `:(){ :|:& };:` (fork bomb)
- `shutdown` / `halt` / `reboot` / `poweroff`
- `dd if=... of=/dev/sd*` (block device overwrite)
If the agent proposes one of these, the shell logs the attempt, returns a denial, and does not surface the command to a human.
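Because Tier 0 uses no model, it reduces to a static pattern scan. A sketch with an illustrative subset of the forbidden list (the shipped pattern set will differ):

```python
import re

# Illustrative subset of the forbidden list; matched before any other tier.
FORBIDDEN_PATTERNS = [
    re.compile(r"\brm\s+-[rRf]+\s+/\s*$"),             # root filesystem wipe
    re.compile(r"\bmkfs(\.\w+)?\b"),                   # disk format
    re.compile(r":\(\)\s*\{\s*:\|:&\s*\}\s*;\s*:"),    # fork bomb
    re.compile(r"\b(shutdown|halt|reboot|poweroff)\b"),
    re.compile(r"\bdd\b.*\bof=/dev/sd"),               # block device overwrite
]

def is_forbidden(command: str) -> bool:
    """Tier 0: True means BLOCKED, logged, never surfaced to a human."""
    return any(p.search(command) for p in FORBIDDEN_PATTERNS)
```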
Tier 1 — Allowlist (Default Deny)
Anything not on the allowlist is RISKY. A command the safety layer has never seen will not silently execute — it stops for human confirmation.
The allowlist is explicit: ping, traceroute, dig, nslookup, ss, netstat, mtr, az (routed to Tier 2), and a small set of file-read operations for trusted forensic outputs. The ip command is also present — ip addr show and ip route show are SAFE; ip route add and ip addr del are RISKY. Unknown commands do not pass.
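Default deny is then a single lookup. A sketch, assuming the allowlist maps base commands to a classification (the names and exact entries here are illustrative, not the library's):

```python
# Illustrative allowlist: base command -> how to classify it.
ALLOWLIST = {
    "ping": "SAFE", "traceroute": "SAFE", "dig": "SAFE", "nslookup": "SAFE",
    "ss": "SAFE", "netstat": "SAFE", "mtr": "SAFE",
    "az": "TIER2",        # routed to Azure verb classification
    "ip": "SUBCOMMAND",   # 'ip addr show' SAFE, 'ip route add' RISKY
}

SAFE_IP_SUBCOMMANDS = {("addr", "show"), ("route", "show")}

def tier1_classify(command: str) -> str:
    parts = command.split()
    base = parts[0] if parts else ""
    entry = ALLOWLIST.get(base)
    if entry is None:
        return "RISKY"   # default deny: unknown commands stop for a human
    if entry == "SUBCOMMAND":
        return "SAFE" if tuple(parts[1:3]) in SAFE_IP_SUBCOMMANDS else "RISKY"
    if entry == "TIER2":
        return "TIER2"   # hand off to the az verb classifier
    return entry
```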
Tier 2 — Azure CLI Verb Classification
For az commands, the safety layer inspects the verb:
| Verb | Classification | Examples |
|---|---|---|
| `list`, `show`, `get`, `check`, `exists`, `wait` | SAFE | `az vm list`, `az network nsg rule show` |
| `create`, `delete`, `update`, `set`, `add`, `remove`, `start`, `stop`, `restart`, `deallocate` | RISKY | `az network nsg rule delete`, `az vm restart` |
| Anything else | RISKY | — |
A read-only investigation runs uninterrupted. The first state-change command stops for confirmation.
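The verb check itself can be sketched as a set lookup over the tokens before the first flag. The verb sets are taken from the table; the parsing is an illustrative simplification:

```python
SAFE_VERBS = {"list", "show", "get", "check", "exists", "wait"}
RISKY_VERBS = {"create", "delete", "update", "set", "add", "remove",
               "start", "stop", "restart", "deallocate"}

def classify_az(command: str) -> str:
    """Classify an az command by its verb. Unknown verbs are RISKY."""
    positional = []
    for tok in command.split()[1:]:
        if tok.startswith("-"):
            break            # flags and their values follow the verb
        positional.append(tok)
    verb = positional[-1] if positional else ""
    if verb in SAFE_VERBS:
        return "SAFE"
    return "RISKY"           # known state-change verbs and anything else
```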
Tier 3 — Dangerous Pattern Detection
Catches what Tiers 0–2 miss:
- `sudo` elevation: `sudo az ...`
- Shell substitution: `az ... $(cmd)` or `` `cmd` ``
- Any command chaining: `|`, `&&`, `||`, `;` — the entire compound command is RISKY
- Destructive commands (`rm`, `kill`, `chmod`, `chown`) appearing anywhere in the argument list
- Output redirection to system paths: `... > /etc/hosts`
- `mv` or `tee` targeting system paths
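Like Tier 0, these checks are plain regexes over the raw command string. A minimal sketch (patterns illustrative, not the library's exact set):

```python
import re

DANGEROUS_PATTERNS = [
    re.compile(r"\bsudo\b"),                      # elevation
    re.compile(r"\$\(|`"),                        # shell substitution
    re.compile(r"\||&&|;"),                       # command chaining
    re.compile(r"\b(rm|kill|chmod|chown)\b"),     # destructive anywhere in args
    re.compile(r">\s*/(etc|usr|bin|boot|dev)/"),  # redirection to system paths
]

def tier3_risky(command: str) -> bool:
    """True means the whole command stops for human confirmation."""
    return any(p.search(command) for p in DANGEROUS_PATTERNS)
```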
What It Looks Like at Runtime
A RISKY command stops the pipeline completely. Terminal output when Ghost Agent identifies a blocking NSG rule and wants to delete it:
────────────────────────────────────────────────────────────────────
⚠️ RISKY COMMAND — Human Approval Required
────────────────────────────────────────────────────────────────────
Command: az network nsg rule delete \
--resource-group nw-forensics-rg \
--nsg-name tf-dest-vm-nsg \
--name ghost-demo-block-8080
Reasoning: NSG rule 'ghost-demo-block-8080' (priority 300, Deny,
TCP, port 8080) is blocking all traffic from 10.0.1.4
to 10.0.1.5. This is the sole cause of the connectivity
failure. Deleting it will restore the connection.
Risk: Azure CLI state-change operation — permanently deletes
a network security group rule.
────────────────────────────────────────────────────────────────────
[A]pprove [D]eny [M]odify >
Nothing has run. The human sees exactly what the agent wants to do and why — in the agent’s own words, unchanged. One keypress to approve. Denying sends a refusal back to the agent, which adjusts its investigation. Modifying lets you change the command before it executes.
A SAFE command runs immediately with no gate:
[SAFE] az network nsg rule list --resource-group nw-forensics-rg \
--nsg-name tf-dest-vm-nsg -o json
→ Executed. 2847 chars returned.
The Audit Trail
Every command — SAFE or RISKY, approved or denied — produces an entry in an append-only JSONL file. One file per session. The shell writes; the agent reads.
{"timestamp": "2026-02-20T19:41:12Z", "session_id": "ghost_20260220T194112", "command": "az network nsg rule list --resource-group nw-forensics-rg --nsg-name tf-dest-vm-nsg -o json", "reasoning": "Check existing NSG rules to determine which port is blocked", "tier": "SAFE", "hitl": null, "exit_code": 0, "output_chars": 2847, "truncated": false}
{"timestamp": "2026-02-20T19:43:07Z", "session_id": "ghost_20260220T194112", "command": "az network nsg rule delete --resource-group nw-forensics-rg --nsg-name tf-dest-vm-nsg --name ghost-demo-block-8080", "reasoning": "Identified blocking rule — deleting to restore connectivity", "tier": "RISKY", "hitl": {"decision": "approve", "modified": false}, "exit_code": 0, "output_chars": 0, "truncated": false}
{"timestamp": "2026-02-20T19:44:01Z", "session_id": "ghost_20260220T194112", "command": "rm -rf /", "reasoning": "Remove temporary files", "tier": "BLOCKED", "hitl": null, "exit_code": null, "output_chars": 0, "truncated": false}
Three commands. One read (SAFE, no HITL). One state-change (RISKY, human approved). One Tier 0 command (BLOCKED — not executed, no human consulted). After an incident, this file tells you exactly what the agent did, in order, with every human decision recorded.
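Append-only JSONL needs nothing more than open-in-append mode and one `json.dumps` per entry. A sketch of what each write might look like (field names copied from the entries above; the function name is an assumption):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def write_audit_entry(audit_dir: str, session_id: str, command: str,
                      reasoning: str, tier: str, hitl, exit_code,
                      output_chars: int, truncated: bool) -> None:
    """Append one JSON object per line; earlier entries are never rewritten."""
    entry = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "session_id": session_id,
        "command": command,
        "reasoning": reasoning,
        "tier": tier,
        "hitl": hitl,
        "exit_code": exit_code,
        "output_chars": output_chars,
        "truncated": truncated,
    }
    path = Path(audit_dir) / f"{session_id}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as f:   # "a" = append-only by construction
        f.write(json.dumps(entry) + "\n")
```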
Using It Standalone
The Safety Shell is a Python library with no opinion about which language model sits above it. Connect any agent — Gemini, GPT-4o, Claude, a fine-tuned local model — through the same execute() API.
```python
from safe_exec_shell import SafeExecShell, HitlDecision

def terminal_hitl_callback(command, reasoning, risk_explanation, tier):
    """Called when a RISKY command needs human approval."""
    print(f"\n⚠️ RISKY: {command}")
    print(f" Reason: {reasoning}")
    print(f" Risk: {risk_explanation}")
    choice = input("\n[A]pprove / [D]eny > ").strip().lower()
    return HitlDecision(action="approve" if choice == "a" else "deny")

shell = SafeExecShell(
    session_id="my-investigation",
    hitl_callback=terminal_hitl_callback,
    audit_dir="./audit"
)

# SAFE — runs immediately, no gate
result = shell.execute({
    "command": "az network nsg rule list --resource-group my-rg -o json",
    "reasoning": "List NSG rules to identify what is blocking port 8080"
})

# RISKY — stops for human approval
result = shell.execute({
    "command": "az vm restart --resource-group my-rg --name prod-vm",
    "reasoning": "Restart the VM to apply the kernel patch"
})
```
The HITL callback is yours to implement — terminal prompt, Slack bot, ticketing system, Jupyter widget. The shell calls it with the command and reasoning, blocks until you return a decision, then proceeds.
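The "fail-closed on timeout" behavior from the Stage 2 diagram can be layered onto any callback with a wrapper like the one below. This is an illustrative sketch, not part of the library; for simplicity it treats the decision as a plain string rather than a `HitlDecision`:

```python
import threading

def fail_closed(callback, timeout_seconds: float = 300.0):
    """Wrap a HITL callback: if no human answers within the deadline,
    the decision is an explicit deny. The gate never defaults open."""
    def wrapped(command, reasoning, risk_explanation, tier):
        result = {"decision": "deny"}   # fail-closed default, set up front
        def ask():
            result["decision"] = callback(command, reasoning,
                                          risk_explanation, tier)
        t = threading.Thread(target=ask, daemon=True)
        t.start()
        t.join(timeout_seconds)         # human didn't answer in time?
        return result["decision"]       # then the answer is "deny"
    return wrapped
```

Because the default is written before the human is even asked, a hung notification bot or an operator who walks away produces a denial, never a silent approval.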
The standalone demo ships with the library:
```shell
cd agentic-safety-shell
export GEMINI_API_KEY="your-key"
uv run python examples/ai_safety_demo.py
```
The demo prompts Gemini 2.0 Flash to act as an aggressive automation engineer and attempt five commands — including risky ones. Each passes through the full classification pipeline; RISKY commands surface as terminal HITL prompts.
Inside Network Ghost Agent
In Network Ghost Agent, the Safety Shell is not optional — it is the only path to command execution. The agent has no subprocess import. Every command passes through shell.execute().
Ghost Agent (Gemini 2.0 Flash)
│
│ shell.execute({ "command": ..., "reasoning": ... })
│
▼
Safety Shell
│
│ Classify → Gate (HITL for RISKY) → Execute → Process Output
│
▼
Shell audit JSONL (append-only, one file per session)
│
▼
Structured response back to Ghost Agent
{ status, action, output, output_metadata }
The agent escalates through levels: Azure control-plane queries first (NSG rules, route tables, effective routes), then packet captures. Each escalation triggers RISKY commands — the Safety Shell surfaces them for confirmation before they run.
Terminal output when Ghost Agent has exhausted control-plane evidence and escalates to a packet capture:
[Turn 6] Ghost Agent → Safety Shell
────────────────────────────────────────────────────────────────────
⚠️ RISKY COMMAND — Human Approval Required
────────────────────────────────────────────────────────────────────
Command: az network watcher packet-capture create \
--resource-group nw-forensics-rg \
--vm tf-source-vm \
--name ghost_tf-source-vm_20260220T192155 \
--storage-account nwlogs080613 \
--time-limit 60 \
--location eastus
Reasoning: NSG rules are clean. Route table shows no UDR overrides.
Control-plane cannot explain the connectivity failure.
Escalating to wire-level evidence: 60-second packet
capture on tf-source-vm to observe whether SYNs are
actually leaving the host.
────────────────────────────────────────────────────────────────────
[A]pprove [D]eny [M]odify >
The agent explains why it wants the capture. You decide.
Command output is also processed before the agent reads it. Azure API responses can embed SAS tokens in blob URIs and access keys in connection strings; these are stripped by regex at the Shell boundary before the output reaches the agent.
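The redaction pass is deterministic for the same reason classification is: a regex cannot be talked out of its job. A sketch of stripping SAS signatures and key-like values (patterns illustrative, not the shipped set):

```python
import re

REDACTIONS = [
    # SAS tokens embedded in blob URIs: sig=<signature>
    (re.compile(r"(sig=)[A-Za-z0-9%+/=_-]+"), r"\1[REDACTED]"),
    # Connection-string style secrets: AccountKey=..., Password=...
    (re.compile(r"((?:AccountKey|Password|SharedAccessKey)=)[^;\s\"]+",
                re.IGNORECASE), r"\1[REDACTED]"),
    # Bearer tokens in headers or logs
    (re.compile(r"(Bearer\s+)[A-Za-z0-9._-]+"), r"\1[REDACTED]"),
]

def redact(text: str) -> str:
    """Strip secret-shaped values before the agent ever sees the output."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```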
What It Does Not Do
Intentional omissions matter as much as features.
No LLM in the classification path. A second language model reasoning about whether the first model’s command is safe would be subject to the same prompt injection and hallucination risks. Static allowlists and regex patterns cannot be argued with or jailbroken.
No asynchronous execution for RISKY commands. The pipeline blocks on human approval. An agent that can queue unapproved state-changes in the background can outpace any human review. The gate is synchronous so it is a real gate.
No network surface. The shell is a Python library, not a remote API or sidecar. There is no network hop between the agent’s command and the safety classification.
No rewritten reasoning. The HITL prompt shows the agent’s own stated reasoning, verbatim. The shell does not summarize or reinterpret it. What you read is what the agent said.
What It Takes to Run
- Python 3.12+ with uv
- No external services, no API keys, no cloud dependencies for the shell itself
```shell
# Clone the full repo
git clone https://github.com/ranga-sampath/agentic-network-tools

# Standalone demo (bring your own Gemini key)
cd agentic-safety-shell
export GEMINI_API_KEY="your-key"
uv run python examples/ai_safety_demo.py

# Or run Network Ghost Agent (Safety Shell built in)
cd network-ghost-agent
uv run --python 3.12 python ghost_agent.py --resource-group <your-rg>
```
The Guarantee
The Safety Shell does not make AI agents infallible. It makes their infrastructure actions classified, human-confirmed where it matters, and permanently recorded.
A BLOCKED command never runs — regardless of what the agent says or how it reasons.
A RISKY command never runs without a human decision — regardless of how confident the agent sounds.
A SAFE command runs, and the record of it is permanent.
The agent can be wrong. The deterministic gate cannot be reasoned out of.
GitHub: github.com/ranga-sampath/agentic-network-tools
Use the Safety Shell with your own agents, or run Network Ghost Agent and watch it operate through the gate in a live investigation.
