// briefing

Aaron's Rogue Agent Lab

Ten interactive walkthroughs of LLM and agent vulnerabilities, covering eight of the OWASP LLM Top 10. Walk the kill chain. See what the model sees. Trigger the compromise. Then read the mitigations.

10 modules · ~50 min total · ~5 min each
// OWASP LLM Top 10 :: coverage matrix
id     category                          covered by
LLM01  Prompt Injection                  01, 09
LLM02  Sensitive Information Disclosure  not interactive
LLM03  Supply Chain                      07
LLM04  Data and Model Poisoning          08
LLM05  Improper Output Handling          05
LLM06  Excessive Agency                  06
LLM07  System Prompt Leakage             04
LLM08  Vector and Embedding Weaknesses   08
LLM09  Misinformation                    not interactive
LLM10  Unbounded Consumption             10
01

Poisoned Webpage Attack

indirect prompt injection · retrieved content

A benign-looking research article carries hidden adversarial instructions in HTML comments, display:none divs, and white-on-white text. The agent fetches the page, ingests the payload as instructions, exfiltrates env secrets, and writes a backdoor to CLAUDE.md.

  • 5 guided steps with interactive terminal
  • Live "reveal hidden" toggle on the victim page
  • Tainted file tracking + persistence step
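
One way to shrink this attack surface is to hand the model only text a human reader would see. The sketch below is an assumption, not the lab's own mitigation: it uses Python's standard html.parser to drop HTML comments, script/style bodies, and anything nested under an inline display:none. The names VisibleTextExtractor and visible_text are hypothetical. White-on-white text is not handled, since catching it reliably needs computed styles from a real rendering engine.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Keeps text outside comments, <script>/<style>, and display:none subtrees."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.hidden_depth = 0  # > 0 while inside a hidden subtree

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        starts_hidden = "display:none" in style or tag in ("script", "style")
        if self.hidden_depth or starts_hidden:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth and data.strip():
            self.chunks.append(data.strip())

    # handle_comment is not overridden, so HTML comments are silently dropped.


def visible_text(page_html: str) -> str:
    """Return the heuristically visible text of a fetched page (hypothetical helper)."""
    parser = VisibleTextExtractor()
    parser.feed(page_html)
    parser.close()
    return " ".join(parser.chunks)
```

Stripping hidden markup lowers the odds of a payload reaching the model, but visible text can carry instructions too, so retrieved content should still be treated as untrusted data.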
02

Tool Response Poisoning

compromised tool · trusted output channel

An agent calls a routine get_weather() tool. The compromised API returns valid data plus a debug_note field carrying instructions. The agent chains into send_email() and exfiltrates API keys.

  • Side-by-side tool inspector with raw JSON
  • Watch the agent chain legitimate tools maliciously
  • MCP server config persistence step
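
A common defense against this pattern is to pass the model only the fields a tool is documented to return. The following is a minimal sketch under assumptions: the field names city, temp_c, and conditions are hypothetical (only debug_note comes from the scenario above), and sanitize_tool_response is an illustrative helper, not part of any MCP SDK.

```python
# Hypothetical schema for get_weather(): field name -> accepted Python type(s).
EXPECTED_WEATHER_FIELDS = {"city": str, "temp_c": (int, float), "conditions": str}


def sanitize_tool_response(raw: dict, schema: dict) -> dict:
    """Keep only allowlisted fields whose values have the expected type."""
    clean = {}
    for key, expected_type in schema.items():
        value = raw.get(key)
        if isinstance(value, expected_type):
            clean[key] = value
    return clean


def dropped_fields(raw: dict, schema: dict) -> list:
    """Names of fields that were present but not allowlisted (worth logging)."""
    return sorted(set(raw) - set(schema))
```

An allowlist removes unexpected channels such as debug_note, but a free-text field like conditions could still carry instructions, so length caps or enumerated values may be worth adding.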
03

Agentic Kill Chain

initial access · persistence · lateral · exfil

A full APT-style attack across a multi-agent system. Vector DB persistence survives session resets; the payload propagates over the inter-agent bus to coder + executor; final exfil ships env, conversation, and PII to a C2 endpoint.

  • Live agent topology with compromise state badges
  • Vector DB inspector + poisoned memory highlighting
  • Inter-agent message bus + outbound C2 log
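
The final exfil step depends on the executor being able to reach an arbitrary host. As a rough illustration (an assumption, not the lab's mitigation panel), an egress allowlist checked before every outbound request might look like this. The host names are hypothetical placeholders and egress_allowed is an illustrative helper.

```python
from urllib.parse import urlsplit

# Hypothetical set of hosts the agents are permitted to contact.
ALLOWED_HOSTS = {"api.internal.example", "pypi.org"}


def egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose exact host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:  # malformed URLs are denied
        return False
    # .hostname ignores any userinfo, so "https://pypi.org@evil.example/" is judged
    # by its real host, evil.example.
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

In practice this check belongs at the network layer as well (proxy or firewall), since code running inside a compromised agent can skip an in-process check.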
04

System Prompt Extraction

information disclosure · OWASP LLM07

The customer support bot has confidential business rules baked into its system prompt: discount tiers, override tokens, a VIP code. Watch them leak via repeat-from-start, translation pivot, and word-list reconstruction. Refusal training cracks one turn at a time.

  • System prompt panel reveals line by line as you attack
  • Six-technique attack library + transcript log
  • Recovered % counter ticks to 100
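
A recovered-percentage counter like the one above can be approximated with verbatim n-gram overlap between the system prompt and the model's output. This is a sketch under assumptions, not the lab's implementation; leaked_fraction is a hypothetical helper, and the same measure could serve as an output-side leak detector.

```python
def leaked_fraction(system_prompt: str, output: str, n: int = 5) -> float:
    """Fraction of the prompt's word n-grams that appear verbatim in the output."""
    words = system_prompt.lower().split()
    grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not grams:  # prompt shorter than n words
        return 0.0
    haystack = " ".join(output.lower().split())  # normalize whitespace and case
    return sum(gram in haystack for gram in grams) / len(grams)
```

Verbatim matching catches repeat-from-start leaks but would likely miss a translation pivot or a paraphrase, which is one reason the more robust fix is to keep secrets such as override tokens out of the system prompt entirely.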
05

LLM-Driven XSS

improper output handling · OWASP LLM05

The admin dashboard agent summarizes user feedback. Hostile feedback items make the agent emit markdown links with javascript: URLs, <img onerror> payloads, and raw <script> tags. Every admin who opens the dashboard fires the captured payload.

  • Live rendered iframe + raw output side by side
  • Three XSS sinks (link, img, direct script)
  • postMessage capture log of real firings
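
All three sinks go away if model output is treated as untrusted text: escape everything first, then re-enable only a narrow subset. The sketch below is an assumption, not the lab's own fix. It escapes all HTML and turns markdown links into anchors only when the scheme is http or https. render_untrusted and the simplified link regex are hypothetical; a production dashboard would more likely rely on a maintained sanitizer plus a Content Security Policy.

```python
import html
import re
from urllib.parse import urlsplit

SAFE_SCHEMES = ("http", "https")
# Simplified markdown link pattern: [label](url) with no spaces in the URL.
MD_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]*)\)")


def render_untrusted(markdown_text: str) -> str:
    """Escape all HTML, then rebuild links only for allowlisted URL schemes."""
    escaped = html.escape(markdown_text, quote=True)

    def link(match):
        label, url = match.group(1), match.group(2)
        try:
            scheme = urlsplit(html.unescape(url)).scheme.lower()
        except ValueError:
            scheme = ""
        if scheme in SAFE_SCHEMES:
            # label and url are already HTML-escaped at this point.
            return f'<a href="{url}" rel="noopener noreferrer">{label}</a>'
        return label  # unsafe or unknown scheme: keep the label, drop the URL

    return MD_LINK.sub(link, escaped)
```

Allowlisting schemes rather than blocking javascript: means unfamiliar or obfuscated schemes fail closed.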
06

Confused Deputy

excessive agency · OWASP LLM06

The PR review agent inherits your maintainer authority on GitHub. Hidden directives in PR comments make it approve and merge without your consent, lower branch protection on main, add the attacker as an admin collaborator, and patch ci.yml to exfil PROD_API_KEY on every CI run.

  • Live PR thread with hidden directives revealed inline
  • Permission scope panel + per-action API call log
  • Repo state diff (branch protection, collabs, workflow)
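
The core problem here is that text in a PR comment can trigger actions carrying maintainer authority. One hedged sketch of a fix is a deny-by-default policy layer between the agent and the API, where privileged actions need out-of-band human approval. The action names below are hypothetical labels for illustration, not GitHub API endpoints, and authorize is an illustrative helper.

```python
from dataclasses import dataclass

# Hypothetical action labels; a real integration would map these to API calls.
LOW_RISK = {"read_diff", "post_comment"}
NEEDS_HUMAN = {"approve_pr", "merge_pr", "edit_branch_protection",
               "add_collaborator", "edit_workflow"}


@dataclass
class Decision:
    allowed: bool
    reason: str


def authorize(action: str, human_approved: bool = False) -> Decision:
    """Deny by default; privileged actions need explicit maintainer approval."""
    if action in LOW_RISK:
        return Decision(True, "low-risk action")
    if action in NEEDS_HUMAN:
        if human_approved:
            return Decision(True, "approved by maintainer")
        return Decision(False, "requires explicit maintainer approval")
    return Decision(False, "unknown action, denied by default")
```

The human_approved flag must come from a channel the model cannot write to (for example a UI confirmation), never from text the agent has read. Issuing the agent a token scoped to the low-risk actions only is a complementary control.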
07

MCP Supply Chain

supply chain · OWASP LLM03

A developer installs an MCP package. Three different compromises hit the same agent: a typosquat clone of the legitimate name, a malicious update from a new co-maintainer, and a backdoored transitive dep four levels deep. Same outcome each time: env exfil and outbound C2.

  • Live npm install + signed package list
  • MCP tool registry shows shadow tools
  • Update diff + outbound C2 traffic log
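
Of the three compromises, the typosquat is the easiest to screen for mechanically: flag any requested package whose name is within a small edit distance of a known-good name. The sketch below is an assumption, not the lab's tooling. The package names are hypothetical, and typosquat_of is an illustrative helper. It does nothing for the malicious update or the transitive dependency, which call for pinned versions, integrity hashes, and review of update diffs.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


# Hypothetical allowlist of vetted MCP package names.
KNOWN_GOOD = {"mcp-server-files", "mcp-server-git"}


def typosquat_of(name: str, known=KNOWN_GOOD, max_distance: int = 2):
    """Return the vetted name this one imitates, or None if it looks unrelated."""
    if name in known:
        return None
    for good in sorted(known):
        if edit_distance(name, good) <= max_distance:
            return good
    return None
```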
08

RAG Document Poisoning

corpus poisoning · OWASP LLM04 + LLM08

An internal Q&A bot answers from a small trusted corpus. An attacker submits a "policy update" via the suggestion form; the doc passes content review (no banned keywords) and is crafted to share embedding space with common HR queries. Now users asking routine questions get attacker-controlled instructions back.

  • Document store with provenance + signature status
  • Live top-k retrieval trace per query with scores
  • Answers panel flags every poisoned response
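
Because the poisoned document wins on similarity score, ranking alone cannot exclude it; provenance has to be checked before anything reaches the prompt. A minimal sketch, assuming each retrieval hit carries a signature flag and a source label as in the document store above. The Hit fields, the source name, and filter_hits are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    score: float   # similarity score from the retriever
    signed: bool   # signature verified at ingestion time
    source: str    # where the document entered the corpus


# Hypothetical label for the reviewed, signed policy source.
TRUSTED_SOURCES = {"hr-policy-repo"}


def filter_hits(hits, k: int = 3):
    """Split hits into (top-k trusted, quarantined) based on provenance."""
    trusted, quarantined = [], []
    for hit in hits:
        if hit.signed and hit.source in TRUSTED_SOURCES:
            trusted.append(hit)
        else:
            quarantined.append(hit)
    trusted.sort(key=lambda h: h.score, reverse=True)
    return trusted[:k], quarantined
```

Returning the quarantined hits separately lets an answers panel flag that a high-scoring but untrusted document was excluded, rather than silently dropping it.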
09

Encoding Bypass

filter evasion · OWASP LLM01 variant

A keyword-based content filter blocks the obvious payload. The same payload in base64, ROT13, or with Cyrillic homoglyphs slips through unchanged; the model decodes it on the way through and complies. Three encodings, one model, zero blocks.

  • Live filter rules + decision history
  • Encoder workbench with side-by-side transformations
  • Model output panel shows what leaked and via which encoding
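
The filter fails because it inspects the raw bytes while the model acts on the decoded meaning. A partial remedy is to run the same keyword check over normalized and decoded candidates. This is a sketch under assumptions: the blocked phrase, the small homoglyph table, and is_blocked are all hypothetical, and only whole-message base64 or ROT13 is handled.

```python
import base64
import codecs
import unicodedata

# Hypothetical blocklist entry, stored in lowercase.
BLOCKED = ("ignore previous instructions",)

# A few Cyrillic letters that resemble Latin ones (far from a complete table).
HOMOGLYPHS = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
    "\u0441": "c", "\u0445": "x", "\u0456": "i",
})


def candidate_decodings(text: str):
    """Yield the text as-is plus ROT13 and base64 decodings when they succeed."""
    yield text
    yield codecs.decode(text, "rot13")
    try:
        yield base64.b64decode(text, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        pass


def is_blocked(text: str) -> bool:
    for candidate in candidate_decodings(text.strip()):
        folded = unicodedata.normalize("NFKC", candidate).casefold()
        folded = folded.translate(HOMOGLYPHS)
        if any(term in folded for term in BLOCKED):
            return True
    return False
```

Decoding before filtering closes these three gaps, but nested or partial encodings would still get through, so keyword filters are better treated as one layer rather than the control.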
10

Unbounded Consumption

resource exhaustion · OWASP LLM10

The agent has tools and no caps. A baseline summary costs cents. Then a recursive memory loop, a 218 MB context bomb, and a 1000-call tool fork bomb each push the meter into hundreds of dollars. Live token counter, cost gauge, and recursion depth bar.

  • Live token + call + cost meter with bar gauges
  • Three independent runaway patterns (loop, bomb, fork)
  • Final mitigation panel shows which limits would have caught each pattern
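
The three runaway patterns map onto three hard caps: recursion depth for the loop, tokens for the context bomb, and call count for the fork bomb. The sketch below is an assumption about what such limits could look like, not the lab's mitigation panel. Budget and BudgetExceeded are hypothetical names, and the limits are arbitrary.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a run would cross one of its hard limits."""


class Budget:
    def __init__(self, max_tokens: int, max_calls: int, max_depth: int):
        self.max_tokens = max_tokens
        self.max_calls = max_calls
        self.max_depth = max_depth
        self.tokens = 0
        self.calls = 0

    def charge(self, tokens: int, depth: int = 0) -> None:
        """Call before each model or tool invocation; raises instead of overspending."""
        if depth > self.max_depth:
            raise BudgetExceeded(f"recursion depth {depth} > {self.max_depth}")
        if self.calls + 1 > self.max_calls:
            raise BudgetExceeded(f"call limit {self.max_calls} reached")
        if self.tokens + tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit {self.max_tokens} would be exceeded")
        # Only record usage once every check has passed.
        self.calls += 1
        self.tokens += tokens
```

Checking before the call, rather than after, means a single oversized request is rejected instead of being billed and then reported.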