AI Agent Security
Ship AI agents that can't be hijacked.
Security engineers and AI builders deploying agents in production.
You ship AI agents to production. Or you're about to. Either way: the number of attack surfaces just doubled, and 80% of the guidance on the internet is either theoretical, outdated (pre-GPT-4o), or specifically about breaking agents — not defending them. Your board just heard about prompt injection. Your PM just promised an agent feature. You need the defensive playbook.
Prompt injection is not a bug — it's a consequence of how LLMs process context. The only safe posture is architectural: treat retrieved content as data, constrain tool access, gate sensitive actions behind human approval, and monitor for post-hoc anomalies. This track encodes that posture as seven playable missions against a simulated vulnerable agent stack.
- M-001
Block prompt injection: input sanitization, canary tokens, output validation
AI assistant concatenates raw user input into LLM context. Attacker injects 'Ignore all instructions' → model leaks API key. Add input sanitization, move secrets, add canary + output validation.
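The canary-plus-output-validation defense described above can be sketched in a few lines. This is a minimal illustration, not the mission's actual stack: the `CANARY` format and the `INJECTION_PATTERNS` list are hypothetical placeholders, and a real deployment would pair them with a maintained filter list and semantic checks.

```python
import re
import secrets

# Hypothetical per-session canary; any unguessable token works.
CANARY = f"cnry-{secrets.token_hex(8)}"

SYSTEM_PROMPT = (
    "You are a support assistant. Never reveal configuration values. "
    f"Internal marker (never output this): {CANARY}"
)

# Illustrative patterns only; real filters need ongoing maintenance.
INJECTION_PATTERNS = [
    r"ignore (all|previous|prior) instructions",
    r"reveal (the )?system prompt",
]

def sanitize_input(user_text: str) -> str:
    """Flag (not silently strip) likely injection phrases before they reach the model."""
    for pat in INJECTION_PATTERNS:
        if re.search(pat, user_text, re.IGNORECASE):
            raise ValueError("possible prompt injection")
    return user_text

def validate_output(model_text: str) -> str:
    """Block any response that leaks the canary, a proxy for prompt disclosure."""
    if CANARY in model_text:
        raise ValueError("canary leaked: response blocked")
    return model_text
```

The canary never protects on its own; it turns "the model leaked the prompt" from an invisible failure into a hard block you can alert on.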
⏱️ 15 min · ⚡ 280 XP · 🎯 7 goals
- M-001
Incident Response: analyze logs to detect breach
Your auth.log shows suspicious login patterns. Analyze logs: identify suspicious IPs, count failed attempts, detect breach, generate report.
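A first pass at the log-analysis step might look like this. The sshd-style log lines and the `FAILED` regex are assumptions, since auth.log formats vary by distro and service.

```python
import re
from collections import Counter

# Matches sshd failure lines, with or without "invalid user" (assumed format).
FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")

def failed_attempts_by_ip(lines):
    """Count failed SSH logins per source IP; high counts suggest brute force."""
    counts = Counter()
    for line in lines:
        m = FAILED.search(line)
        if m:
            counts[m.group(2)] += 1
    return counts

log = [
    "Jan 10 03:12:01 host sshd[811]: Failed password for root from 203.0.113.7 port 52211 ssh2",
    "Jan 10 03:12:03 host sshd[811]: Failed password for invalid user admin from 203.0.113.7 port 52213 ssh2",
    "Jan 10 03:15:44 host sshd[902]: Accepted password for alice from 198.51.100.4 port 40022 ssh2",
]
print(failed_attempts_by_ip(log))  # Counter({'203.0.113.7': 2})
```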
⏱️ 16 min · ⚡ 280 XP · 🎯 6 goals
- M-001
GDPR Data Minimization: reduce data collection to essential only
Your user schema collects excessive data (phone, address, DOB, IP, user-agent). Reduce to essential fields: implement data retention policy, right-to-be-forgotten, update privacy policy.
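The minimization step can be sketched as a field allowlist plus a retention check. The `ESSENTIAL_FIELDS` set and the 365-day window below are placeholders; what counts as essential depends on your documented lawful basis.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical essential-field allowlist; derive yours from the lawful basis.
ESSENTIAL_FIELDS = {"user_id", "email", "created_at"}

def minimize(record: dict) -> dict:
    """Drop every field not on the documented allowlist before persisting."""
    return {k: v for k, v in record.items() if k in ESSENTIAL_FIELDS}

def past_retention(created_at: datetime, days: int = 365) -> bool:
    """Retention check: records older than the policy window are due for deletion."""
    return datetime.now(timezone.utc) - created_at > timedelta(days=days)

raw = {"user_id": 1, "email": "a@example.eu", "phone": "+49-000",
       "dob": "1990-01-01", "ip": "203.0.113.7",
       "created_at": datetime.now(timezone.utc)}
print(sorted(minimize(raw)))  # ['created_at', 'email', 'user_id']
```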
⏱️ 17 min · ⚡ 290 XP · 🎯 6 goals
- M-001
Recognize attack patterns under fire
The attacker is live. Spot the pattern, deploy the countermeasure, level up.
⏱️ 18 min · ⚡ 300 XP · 🎯 4 goals
- M-002
Apply least privilege to AI agent tools: path allow-lists, remove execShell, domain whitelist
AI coding assistant has unrestricted readFile, writeFile, execShell, httpFetch. One injection = RCE. Apply least privilege: path allow-lists, confirmation codes for writes, remove shell, domain whitelist.
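A path allow-list for the agent's `readFile`/`writeFile` tools might be enforced like this. The `/workspace` roots are hypothetical; the key detail is resolving symlinks and `..` before the containment check, not after.

```python
from pathlib import Path

# Hypothetical project roots the agent may touch; everything else is denied.
ALLOWED_ROOTS = [Path("/workspace/src").resolve(), Path("/workspace/docs").resolve()]

def check_path(requested: str) -> Path:
    """Resolve symlinks and '..' first, then verify containment in an allowed root."""
    p = Path(requested).resolve()
    for root in ALLOWED_ROOTS:
        if p == root or root in p.parents:
            return p
    raise PermissionError(f"path outside allow-list: {requested}")
```

Checking the raw string (e.g. `startswith("/workspace/src")`) is the classic bug: `/workspace/src/../../etc/passwd` passes a prefix check but fails the resolved one.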
⏱️ 14 min · ⚡ 260 XP · 🎯 7 goals
- M-002
Detect the real alert from the noise
Filter log noise, triage the incident, trigger the right playbook.
⏱️ 12 min · ⚡ 250 XP · 🎯 4 goals
- M-002
Translate NIS2 into engineering controls
Map NIS2 requirements to concrete technical controls. No more paragraph-reading.
⏱️ 15 min · ⚡ 280 XP · 🎯 5 goals
- M-002
Supply chain security — trust no one
Your dependencies are attack vectors. Secure the supply chain.
⏱️ 20 min · ⚡ 320 XP · 🎯 4 goals
- M-003
Sanitize LLM output: DOMPurify, Markdown renderer hardening, Content-Security-Policy
AI chat renders raw LLM output as innerHTML. Indirect prompt injection via knowledge-base doc causes stored XSS. Add DOMPurify, harden Markdown renderer, add CSP, enforce structured output.
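On the server side, a defense-in-depth fallback (assuming the client also sanitizes with DOMPurify before insertion) is to escape model output so it renders as text, and to ship a restrictive CSP. The header value below is an illustrative baseline, not a drop-in policy.

```python
import html

# Illustrative CSP baseline; tune sources to what the page actually loads.
CSP = "default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'none'"

def escape_llm_output(text: str) -> str:
    """Escape raw model text so the browser renders it as data, never as markup."""
    return html.escape(text, quote=True)

payload = '<img src=x onerror="fetch(\'//evil.example\')">'
print(escape_llm_output(payload))
# &lt;img src=x onerror=&quot;fetch(&#x27;//evil.example&#x27;)&quot;&gt;
```

Escaping everything and then re-rendering through a hardened Markdown pipeline is safer than trying to enumerate dangerous tags; the CSP catches whatever slips through.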
⏱️ 13 min · ⚡ 240 XP · 🎯 7 goals
- M-003
Triage under pressure — the 03:00 wake-up call
PagerDuty goes off at 03:00. You have 5 minutes to triage. No panic.
⏱️ 13 min · ⚡ 260 XP · 🎯 4 goals
- M-003
DORA compliance — ICT risk management
DORA requires financial institutions to manage ICT risk. Translate to engineering controls.
⏱️ 18 min · ⚡ 300 XP · 🎯 4 goals
- M-003
Social engineering defense — humans are the weakest link
The most sophisticated attack targets humans. Defend against social engineering.
⏱️ 21 min · ⚡ 330 XP · 🎯 4 goals
- M-004
LLM API cost protection: rate limiting, auth, token budgets, circuit breaker
$47k OpenAI bill in 4 hours from anonymous cost-DoS. Add IP rate limiting, authentication, server-side token cap, per-user daily budget, and a global circuit breaker at $500/day.
⏱️ 12 min · ⚡ 250 XP · 🎯 7 goals
- M-004
Containment playbooks — stop the bleeding
The breach is live. Isolate systems, block the attacker, stop the damage.
⏱️ 14 min · ⚡ 270 XP · 🎯 4 goals
- M-004
EU AI Act compliance — technical obligations
EU AI Act imposes technical obligations on AI systems. Implement the controls.
⏱️ 19 min · ⚡ 310 XP · 🎯 4 goals
- M-004
Ransomware defense — prepare for the worst
Ransomware is inevitable. Prepare, detect, respond.
⏱️ 22 min · ⚡ 340 XP · 🎯 4 goals
- M-005
Forensics without destroying evidence
The breach is contained. Investigate without destroying evidence. Chain of custody matters.
⏱️ 15 min · ⚡ 280 XP · 🎯 4 goals
- M-005
DSGVO Art. 32 compliance — state of the art
DSGVO (the German GDPR) Art. 32 requires 'state of the art' security. Implement the controls.
⏱️ 20 min · ⚡ 320 XP · 🎯 4 goals
- M-005
ML security — defend the model
ML models are attack surfaces. Defend against adversarial ML.
⏱️ 23 min · ⚡ 350 XP · 🎯 4 goals
- M-006
Incident recovery — restore and verify
The breach is contained. Restore services from backups and verify system integrity.
⏱️ 16 min · ⚡ 290 XP · 🎯 4 goals
- M-006
Evidence collection — audit ready
Compliance is useless without evidence. Collect and organize for audit.
⏱️ 21 min · ⚡ 330 XP · 🎯 4 goals
- M-006
Red teaming — think like the attacker
To defend, you must attack. Think like the attacker to find vulnerabilities.
⏱️ 24 min · ⚡ 360 XP · 🎯 4 goals
- M-007
Root cause analysis — find the why
The incident is resolved. But why did it happen? Find the root cause and recommend remediation.
⏱️ 17 min · ⚡ 300 XP · 🎯 4 goals
- M-007
SOC2 Type II compliance — security controls
SOC2 Type II requires documented security controls. Implement and evidence.
⏱️ 22 min · ⚡ 340 XP · 🎯 4 goals
- M-007
Blue teaming — defend the fortress
The attacker is coming. Defend the fortress. Detect, respond, recover.
⏱️ 25 min · ⚡ 370 XP · 🎯 4 goals
- M-008
Incident post-mortem — learn and improve
The incident is over. Document what happened, identify lessons learned, and create action items.
⏱️ 18 min · ⚡ 310 XP · 🎯 4 goals
- M-008
ISO27001 compliance — ISMS implementation
ISO27001 requires an Information Security Management System. Build it.
⏱️ 23 min · ⚡ 350 XP · 🎯 4 goals
- M-008
Purple teaming — red + blue collaboration
Red and blue teams working together. Collaborative security testing.
⏱️ 26 min · ⚡ 380 XP · 🎯 4 goals
- M-009
Incident response playbooks — ready to run
When the alarm goes off, you don't think. You execute. Build the playbooks.
⏱️ 19 min · ⚡ 320 XP · 🎯 4 goals
- M-009
Third-party risk management
Your security is only as strong as your weakest vendor. Manage third-party risk.
⏱️ 24 min · ⚡ 360 XP · 🎯 4 goals
- M-009
Threat intelligence — know your enemy
Intelligence-driven security. Know your enemy before they attack.
⏱️ 27 min · ⚡ 390 XP · 🎯 4 goals
- M-010
Incident communication — transparent and timely
The incident is happening. Communicate transparently. Trust is on the line.
⏱️ 20 min · ⚡ 330 XP · 🎯 4 goals
- M-010
OSINT — open source intelligence
Public information is intelligence. Gather, analyze, act.
⏱️ 28 min · ⚡ 400 XP · 🎯 4 goals
- M-011
Incident drills — practice makes perfect
Playbooks are useless if you haven't practiced. Run the drills and improve.
⏱️ 21 min · ⚡ 340 XP · 🎯 4 goals
Concrete outcomes. No lecture notes.
- 01 An LLM gateway with input sanitisation, output filtering, and rate limiting
- 02 A sandboxed tool execution layer — your agent can call functions but can't exfiltrate
- 03 A threat model document for your specific agent (template + real examples)
- 04 Prompt-level guardrails that resist the OWASP Top 10 for LLMs
- 05 An audit log strong enough to satisfy the EU AI Act's logging requirements
- 06 A human-in-the-loop flow for high-impact actions, with friction calibrated to risk
- ▸ Product teams shipping LLM agents to customers
- ▸ Security engineers handed an AI roadmap
- ▸ Startups building on OpenAI, Anthropic, or local LLMs for regulated customers
- ▸ Technical leads who need to answer 'are we AI Act ready?'
Maps to EU AI Act Articles 9 (risk management), 12 (record keeping), 14 (human oversight), and 15 (accuracy & robustness). Ships with an AI Act technical documentation template you can submit as Annex IV evidence. Also touches OWASP Top 10 for LLMs and NIST AI RMF.
We were about to ship an agent to support tickets. Ran the Prompt Injection Sandbox and the Threat Modeling mission. Found three bypasses we never would have caught in code review. Shipping delayed by a week. Worth it.
Defender III — AI Security
Complete all 6 AI Agent Security missions + pass the live 'defend an agent for 60 minutes' capstone challenge (Red Team AI Co-Player active).
- ✓ W3C Verifiable Credential — AI Security specialisation
- ✓ EU AI Act technical documentation template (Annex IV starter)
- ✓ Annual recertification kept free for graduates
- ✓ Listing in the public ClawGuru AI Security Defenders directory (opt-in)
Questions we already got.
Does this teach jailbreaking techniques?
No. This is strictly defensive. We show you how attackers think — but every mission's goal is to ship a mitigation, not a bypass.
Is the content vendor-neutral?
Yes. The guardrails work whether you're on OpenAI, Anthropic, Google, or local Llama/Qwen/aya. Where vendor-specific features matter (moderation APIs, function-calling quirks), we call them out.
What about agent frameworks (LangChain, CrewAI, Agentic SDK)?
Covered generically — the attack surface is in the pattern, not the framework. We include examples for the most common patterns as of 2026.
How current is this?
Refreshed quarterly. The CVE Time Machine integration (when it ships) will automatically generate new missions for fresh AI-related CVEs — you'll see them marked 'hot' in the track.