AI-Powered Network Analysis from Raw Packet Captures
For network engineers, cloud architects, and the product leaders who support them.
The Problem
A packet capture is the most honest evidence available in a network investigation. It is also the hardest to read.
Raw .pcap files do not have opinions. They record every frame — retransmissions,
DNS timeouts, ARP replies, RST teardowns — without context, without prioritization,
and without telling you which of the 134 packets (or 134,000) actually explain why
the application is failing.
The engineer who can open a PCAP in Wireshark, apply the right filters, identify the retransmission pattern, correlate it with the ICMP “Host Unreachable” from the upstream router, and connect that to the DNS SERVFAIL for the same destination — that engineer exists. There are not many of them. And when the on-call rotation lands on someone else, the PCAP sits there, unread, while the investigation stalls at the control plane.
The Agentic PCAP Forensic Engine was built to remove that dependency.
How Engineers Analyze PCAPs Today
Engineer receives .pcap file
→ Opens Wireshark or runs tshark manually
→ Applies display filters one at a time: tcp.analysis.retransmission
→ Counts retransmissions per stream manually
→ Does not correlate with ICMP Unreachable from upstream router
→ Opens Statistics > DNS tab
→ Sees NXDOMAIN count — does not check if the subdomain pattern is DGA
→ Exports TCP streams to CSV, opens in spreadsheet
→ Calculates RTT percentiles manually — tedious, error-prone
→ Documents findings in a text file
→ No structured severities, no remediation commands, no frame citations
→ Result: protocol-by-protocol notes that miss cross-layer root cause
Each protocol is analyzed in isolation, so cross-layer correlations are missed. The depth of analysis is bounded by the analyst’s recall of RFC failure modes under time pressure.
What the PCAP Forensic Engine Does
It is a four-stage pipeline that transforms a raw packet capture into a ranked, actionable forensic report.
┌──────────────────────────────────────────────────────────────┐
│ Input: capture.pcap or capture.cap │
└───────────────────────────┬──────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Stage 1 — Protocol Extraction (tshark) │
│ │
│ ARP · ICMP · TCP · DNS — each in a dedicated tshark pass │
│ Per-packet fields: flags, RTT, stream IDs, RCODE, MACs │
└───────────────────────────┬──────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Stage 2 — Semantic Reduction │
│ │
│ Packet rows → protocol statistics + structured anomaly flags│
│ Percentile aggregation: min / median / p95 / max │
│ Up to 95% data reduction — diagnostic signal preserved │
│ Output: Semantic JSON (~10–50 KB) │
└───────────────────────────┬──────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Stage 3 — AI Forensic Reasoning (Gemini 2.0 Flash) │
│ │
│ Cross-protocol correlation │
│ RFC-grounded anomaly interpretation │
│ Severity ranking: CRITICAL > HIGH > MEDIUM > LOW > INFO │
└───────────────────────────┬──────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Stage 4 — Forensic Report (Markdown) │
│ │
│ Executive Summary · Anomaly Table · RCA · Remediation │
└──────────────────────────────────────────────────────────────┘
Stage 2 is the critical step. An LLM cannot reason reliably over thousands of raw packet rows. It can reason precisely over a compact statistical summary — which is why the Semantic JSON exists.
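For context, a Stage-1 pass might be assembled like this — a sketch, not the engine's actual invocation: the field names are standard tshark fields, but the exact flags and field list used by the pipeline are assumptions.

```python
# Sketch of a Stage-1-style per-protocol tshark pass. Field names are
# real tshark fields; the engine's exact invocation is an assumption.
def tshark_cmd(pcap_path: str, display_filter: str, fields: list[str]) -> list[str]:
    """Build a tshark command that emits one tab-separated row per packet."""
    cmd = ["tshark", "-r", pcap_path, "-Y", display_filter, "-T", "fields"]
    for f in fields:
        cmd += ["-e", f]
    return cmd

# One dedicated pass per protocol, as in the pipeline diagram above.
tcp_pass = tshark_cmd(
    "capture.pcap",
    "tcp",
    ["frame.number", "tcp.stream", "tcp.flags",
     "tcp.analysis.ack_rtt", "tcp.analysis.retransmission"],
)
print(" ".join(tcp_pass))
```

Each pass yields per-packet rows that Stage 2 then collapses into statistics.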
A Real Capture, Analyzed
uv run python pcap_forensics.py kitchen_sink.pcap
The capture: 134 packets, 14.9 seconds, four protocols present — ARP, ICMP, TCP, DNS.
Step 1: Semantic JSON
134 packets reduce to a compact JSON. The ARP section, verbatim from the actual output:
"arp": {
  "total_requests": 7,
  "total_replies": 9,
  "unanswered_requests": [{"ip": "10.0.0.99", "count": 3}],
  "gratuitous_arp_count": 1,
  "duplicate_ip_alerts": [
    {
      "ip": "10.0.0.5",
      "macs": ["aa:bb:cc:dd:ee:05", "ff:ee:dd:cc:bb:aa"],
      "sample_frames": [7, 10]
    }
  ]
}
One field surfaces the critical finding: IP 10.0.0.5 is claimed by two different MAC addresses. In the raw capture this is buried across 134 frames. In the Semantic JSON it is a structured alert — protocol, IP, both MACs, and the exact frames.
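The check behind duplicate_ip_alerts is conceptually simple: group sender MACs by claimed IP and flag any IP claimed by more than one. A minimal sketch, with an illustrative row shape rather than the engine's internals:

```python
# Minimal sketch of the duplicate-IP check behind "duplicate_ip_alerts".
# The row shape (frame, sender_ip, sender_mac) is illustrative.
from collections import defaultdict

def duplicate_ip_alerts(arp_rows):
    """arp_rows: iterable of (frame_no, sender_ip, sender_mac) from ARP packets."""
    seen = defaultdict(lambda: {"macs": set(), "frames": []})
    for frame, ip, mac in arp_rows:
        if mac not in seen[ip]["macs"]:
            seen[ip]["macs"].add(mac)
            seen[ip]["frames"].append(frame)   # first frame per distinct MAC
    return [
        {"ip": ip, "macs": sorted(v["macs"]), "sample_frames": v["frames"]}
        for ip, v in seen.items() if len(v["macs"]) > 1
    ]

rows = [(7, "10.0.0.5", "aa:bb:cc:dd:ee:05"),
        (9, "10.0.0.7", "aa:bb:cc:dd:ee:07"),
        (10, "10.0.0.5", "ff:ee:dd:cc:bb:aa")]
print(duplicate_ip_alerts(rows))
# → [{'ip': '10.0.0.5', 'macs': ['aa:bb:cc:dd:ee:05', 'ff:ee:dd:cc:bb:aa'], 'sample_frames': [7, 10]}]
```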
The TCP section captures anomalies at stream granularity:
{
  "stream_id": 1,
  "src": "10.0.0.1:40001",
  "dst": "10.0.0.2:80",
  "retransmissions": 5,
  "rst": true,
  "ack_rtt_ms": {"min": 1.0, "median": 210.0, "p95": 210.0, "max": 210.0},
  "sample_frames": [68, 71, 74, 77, 80]
}
Stream 0 on the same destination has a median ACK RTT of 5ms. Stream 1 is at 210ms — 42x worse — and ends with a RST. Both streams go to the same host. The difference is the signal.
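The percentile aggregation itself is only a few lines. A sketch of how the min/median/p95/max summaries might be produced — the engine's exact percentile method is an assumption; this uses a nearest-rank p95:

```python
# Sketch of the Stage-2 percentile aggregation. The engine's exact
# percentile method is an assumption; this uses a nearest-rank p95.
import statistics

def rtt_summary(samples_ms):
    s = sorted(samples_ms)
    p95 = s[min(len(s) - 1, int(round(0.95 * (len(s) - 1))))]
    return {"min": s[0], "median": statistics.median(s), "p95": p95, "max": s[-1]}

print(rtt_summary([1.0, 210.0, 210.0]))
# → {'min': 1.0, 'median': 210.0, 'p95': 210.0, 'max': 210.0}
```

Four numbers per stream carry the diagnostic signal that thousands of raw RTT samples would bury.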
Step 2: Forensic Report
The executive summary from the engine:
“The capture contains a critical security alert: multiple NXDOMAIN responses for domains matching a DGA pattern (“evil-c2.com”) indicate malware beaconing to command-and-control servers. Additionally, there is an IP-MAC conflict detected for IP address 10.0.0.5, signaling potential ARP spoofing. Host 10.0.0.99 is confirmed unreachable at both Layer 2 and Layer 3 — three unanswered ARP requests establish it is absent from the segment, and router 10.0.0.254 corroborates this with an ICMP Host Unreachable. Separately, TCP stream 1 is experiencing significant retransmissions to 10.0.0.2:80, and slow DNS queries are impacting resolution times.”
The anomaly table — exact output from the engine, ranked by severity:
| Severity | Protocol | Issue | Detail | Frame(s) |
|---|---|---|---|---|
| CRITICAL | DNS | DGA Detection | Multiple NXDOMAIN responses for domains under “evil-c2.com” (e.g., ohgi1jny.evil-c2.com, t7rpwh6n.evil-c2.com) suggest DGA malware activity. | 104, 106, 108, 110, 112 |
| CRITICAL | ARP | IP-MAC Conflict | Multiple MAC addresses (aa:bb:cc:dd:ee:05, ff:ee:dd:cc:bb:aa) claim IP 10.0.0.5, indicating potential ARP spoofing. | 7, 10 |
| HIGH | ICMP | Host Unreachable | Router 10.0.0.254 reports “Host Unreachable” (code 1) for 10.0.0.99; error sent to 10.0.0.1. Corroborates unanswered ARP: 10.0.0.99 is confirmed unreachable at both L2 and L3. | 43 |
| HIGH | DNS | SERVFAIL | The domain “broken.internal” returns SERVFAIL, indicating a problem with the authoritative DNS server. | 114 |
| MEDIUM | TCP | Retransmissions | TCP stream 1 (10.0.0.1:40001 → 10.0.0.2:80) experiences 5 retransmissions. | 68, 71, 74, 77, 80 |
| MEDIUM | ICMP | Elevated RTT | ICMP RTT has a median of 5ms but a p95 of 300ms, indicating occasional significant latency spikes. | 37, 39, 41 |
| MEDIUM | DNS | Slow Queries | Queries for slow0.remote.com, slow1.remote.com, and slow2.remote.com take 500ms each. | 120, 122, 124 |
| LOW | ARP | Unanswered ARP Requests | 3 unanswered ARP requests for 10.0.0.99 | N/A |
| LOW | ICMP | TTL Exceeded | TTL Exceeded messages originate from 172.16.0.1, suggesting a potential routing issue. | 46 |
Nine findings across four protocols, ranked by severity, every finding tied to specific frames. The remediation section that follows gives exact CLI commands for each — not generic guidance.
Cross-Protocol Correlation
ARP unanswered requests + ICMP Host Unreachable — two layers, one dead host
The ARP table shows 3 unanswered requests for 10.0.0.99. On its own, that is a LOW finding — the host might be powered off, on the wrong VLAN, or simply slow to respond.
The ICMP layer adds the routing-layer confirmation. The Semantic JSON records:
{
  "src": "10.0.0.254",
  "dst": "10.0.0.1",
  "code": 1,
  "code_meaning": "Host Unreachable",
  "unreachable_dst": "10.0.0.99"
}
This is standard ICMP Destination Unreachable behaviour: the router sends the error
back to the original sender of the failed packet. dst (10.0.0.1) is the sender
who receives the notification. unreachable_dst (10.0.0.99) is the host the router
could not reach — extracted from the inner IP header embedded in the ICMP payload.
Now the picture is complete. 10.0.0.1 tried to reach 10.0.0.99. The switch could not resolve 10.0.0.99’s MAC (unanswered ARP). Router 10.0.0.254 also cannot forward traffic to 10.0.0.99. Both layers agree: the host is absent. That changes the LOW ARP finding into a confirmed infrastructure gap — something to act on, not just monitor.
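The escalation logic reduces to a set intersection. A sketch using the field names from the Semantic JSON above — the LOW-to-HIGH rule is an assumed reading of the engine's behaviour, not its literal code:

```python
# Sketch of cross-layer escalation: an unanswered-ARP IP that also
# appears as an ICMP unreachable_dst is upgraded from LOW to HIGH.
# Field names mirror the Semantic JSON; the rule itself is assumed.
def escalate_dead_hosts(arp_unanswered, icmp_unreachables):
    arp_ips = {e["ip"] for e in arp_unanswered}
    icmp_ips = {e["unreachable_dst"] for e in icmp_unreachables}
    confirmed = arp_ips & icmp_ips          # dead at both L2 and L3
    return {ip: ("HIGH" if ip in confirmed else "LOW") for ip in arp_ips}

sev = escalate_dead_hosts(
    [{"ip": "10.0.0.99", "count": 3}],
    [{"src": "10.0.0.254", "unreachable_dst": "10.0.0.99", "code": 1}],
)
print(sev)   # → {'10.0.0.99': 'HIGH'}
```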
DNS NXDOMAIN pattern → DGA malware detection
Five NXDOMAIN responses for subdomains under evil-c2.com: ohgi1jny, t7rpwh6n,
nyfx4m8t, 1qojah9t, q6v9mh7f. The randomness of the subdomain strings is the
signal — no human types these. Gemini identifies the Domain Generation Algorithm
pattern and escalates to CRITICAL. A threshold rule would flag “5 NXDOMAINs.” The engine
reads the subdomain entropy, names the threat class, and gives the remediation. From the
actual report:
“Identify the infected host: examine DNS query logs on the DNS server (10.0.0.53) to find the source IP address making the NXDOMAIN queries to domains like ‘ohgi1jny.evil-c2.com’. Isolate the infected host: disconnect from the network immediately to prevent further communication with the C2 server.”
The output is not a list of statistics. It is a diagnosis.
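The "no human types these" intuition can be approximated numerically with per-label Shannon entropy — an illustrative proxy only; the engine relies on the model's pattern judgment rather than a fixed formula:

```python
# Rough per-label Shannon entropy as a DGA proxy (illustrative only;
# the engine's detection is model-driven, not this formula).
import math
from collections import Counter

def label_entropy(label: str) -> float:
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

for label in ["ohgi1jny", "t7rpwh6n", "www", "mail"]:
    print(f"{label:10s} {label_entropy(label):.2f}")
```

Random 8-character labels with no repeated characters score 3.0 bits; short human-chosen labels like "www" score near zero.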
Three Operating Modes
Single Capture — Forensic Report
Default mode. One .pcap or .cap file in, one Markdown report out.
uv run python pcap_forensics.py capture.pcap
Output: capture_forensic_report.md
Temporal Comparison — Before vs. After
Two captures from the same segment at different times. The engine computes per-metric deltas and classifies each as STABLE, NEW ISSUE, REGRESSION, or RESOLVED. The change summary table, exact output from the comparison report:
| Protocol | Metric | Capture A | Capture B | Delta | Assessment |
|---|---|---|---|---|---|
| ARP | IP-MAC Conflicts | 0 | 1 | +1 | NEW ISSUE |
| ICMP | RTT Median (ms) | 5.0 | 5.0 | 0 | STABLE |
| TCP | Retransmission Rate | 0% | 0% | 0 | STABLE |
| TCP | Handshake Success Rate | 100% | 100% | 0 | STABLE |
| DNS | Latency Median (ms) | 15.0 | 15.0 | 0 | STABLE |
Everything stable — except a new ARP IP-MAC conflict that appeared between captures. Without the comparison, this would be invisible in the current-state capture alone.
uv run python pcap_forensics.py baseline.pcap --compare current.pcap --mode temporal
Output: baseline_vs_current_comparison.md
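The assessment column follows a simple rule. One plausible classification, sketched — the engine's exact thresholds and tolerance handling are not shown, and this assumes higher values are worse for the metric being compared:

```python
# One plausible rule for the temporal assessment column (a sketch; the
# engine's exact thresholds are not shown, and this assumes higher = worse).
def assess(a: float, b: float) -> str:
    if a == b:
        return "STABLE"
    if a == 0 and b > 0:
        return "NEW ISSUE"
    if b == 0 and a > 0:
        return "RESOLVED"
    return "REGRESSION" if b > a else "RESOLVED"

print(assess(0, 1))      # → NEW ISSUE   (the ARP IP-MAC conflict above)
print(assess(5.0, 5.0))  # → STABLE
```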
Endpoint Correlation — Source vs. Destination
Two simultaneous captures from both ends of a path. If a flow is present at the source
but absent at the destination, the loss is in the path, not the endpoints — a finding
no NSG or route table query can produce. Drop rate is calculated per matched flow:
(source packets − destination packets) / source packets.
uv run python pcap_forensics.py source.pcap --compare dest.pcap --mode endpoint-correlation
Output: source_vs_dest_comparison.md
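The drop-rate formula, applied per matched flow — a sketch with an assumed 5-tuple key shape; a flow seen only at the source counts as total loss:

```python
# Sketch of the per-flow drop-rate calculation for endpoint correlation.
# Flows are keyed by an assumed 5-tuple; absence at the destination → 100% loss.
def drop_rates(src_counts: dict, dst_counts: dict) -> dict:
    rates = {}
    for flow, sent in src_counts.items():
        received = dst_counts.get(flow, 0)
        rates[flow] = (sent - received) / sent
    return rates

flow = ("10.0.0.1", 40001, "10.0.0.2", 80, "tcp")
print(drop_rates({flow: 100}, {flow: 97}))
# → {('10.0.0.1', 40001, '10.0.0.2', 80, 'tcp'): 0.03}
```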
What It Detects
| Protocol | What the Engine Looks For |
|---|---|
| ARP | IP-MAC conflicts (ARP spoofing / cache poisoning); unanswered requests (host down or unreachable); gratuitous ARP announcements (VRRP/HSRP failover, NIC teaming) |
| ICMP | Fragmentation Needed (type 3 code 4) — path MTU constraint, correlated with TCP retransmissions on large segments to identify MTU mismatch; routing loops (TTL Exceeded repeatedly from the same source); ICMP Redirect — potential traffic interception signal |
| TCP | Retransmission storms — scattered across many streams (network-wide loss) vs. concentrated on one stream (endpoint or application problem); 3 duplicate ACKs triggering Fast Retransmit (RFC 5681); zero-window stalls (application not reading fast enough — not a network problem); RST origin analysis (local endpoint teardown vs. middlebox injection); bufferbloat (ACK RTT p95/median > 10) |
| DNS | DGA malware (high-entropy random NXDOMAIN subdomain patterns); DNS tunneling (TXT queries > 20% of total); resolver overload (unanswered query rate > 5%); latency outliers; SERVFAIL attribution |
The default for any pattern the engine cannot classify confidently is to surface it at INFO severity. No signal is silently discarded.
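The numeric heuristics in the table translate directly into threshold checks. A sketch using the cutoffs quoted above (the function boundaries and names are illustrative, not the engine's API):

```python
# The table's numeric heuristics as threshold checks. Cutoffs come from
# the table above; function names and shapes are illustrative.
def bufferbloat(p95_ms: float, median_ms: float) -> bool:
    return median_ms > 0 and p95_ms / median_ms > 10

def dns_tunneling_suspect(txt_queries: int, total_queries: int) -> bool:
    return total_queries > 0 and txt_queries / total_queries > 0.20

def resolver_overload(unanswered: int, total: int) -> bool:
    return total > 0 and unanswered / total > 0.05

print(bufferbloat(300.0, 5.0))        # → True  (p95/median = 60)
print(dns_tunneling_suspect(3, 20))   # → False (15% TXT)
```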
What It Takes to Run
Prerequisites:
- Python 3.12+ with uv
- tshark — brew install wireshark or apt install tshark
- A Gemini API key from aistudio.google.com — the free tier handles most captures
Configuration: Set GEMINI_API_KEY in a .env file in the project directory.
No other configuration required for standalone use.
Run:
uv run python pcap_forensics.py your-capture.pcap
Cost per analysis: A typical analysis with Gemini 2.0 Flash costs under $0.02.
Standalone Tool, or Part of a Larger System
Standalone: Point it at any .pcap or .cap file — a tcpdump on a Linux VM,
a Wireshark session on a laptop, a switch span port capture, or an Azure Network
Watcher download. No cloud infrastructure required.
Integrated: Inside Ghost Agent, the engine is the final stage of an automated capture-and-analysis pipeline:
Ghost Agent — control-plane analysis exhausted, wire-level evidence needed
│
▼
Cloud Orchestrator — creates Azure Network Watcher capture
│ polls until Succeeded → downloads .cap blob from storage
▼
PCAP Forensic Engine — runs against the downloaded file
│ produces capture_forensic_report.md
▼
Ghost Agent — reads report, incorporates wire-level findings into RCA
Ghost Agent does not call pcap_forensics.py directly — it invokes it through the
Safety Shell, so the execution is logged, auditable, and classified by the same
four-tier pipeline as every other command. The forensic report is auto-approved for
reading (it is a file the system itself created), so Ghost Agent reads it without
triggering a human approval prompt.
Conclusion
A packet capture is definitive evidence. It is also unusable without the expertise to read it. The PCAP Forensic Engine encodes that expertise — protocol failure mode knowledge, RFC semantics, cross-layer correlation — into a repeatable pipeline that produces a ranked, frame-cited forensic report from any capture file.
Run it standalone. Drop it into your automation. Or let Ghost Agent invoke it when the control plane runs out of answers.
GitHub: github.com/ranga-sampath/agentic-network-tools
Clone the repo. Run it against a capture. Read the report.
