Zum Hauptinhalt springen
LIVE Intel Feed
"Not a Pentest" Notice: This guide is for defending your own AI systems. No attack tools, no exploitation of external systems.
Moltbot AI Security · Pillar Page

KI-Agenten Sicherheit: Vollständiger Leitfaden 2026

LLM-basierte KI-Agenten sind die am schnellsten wachsende Angriffsfläche in der modernen Infrastruktur. Dieser Leitfaden gibt dir das vollständige Abwehr-Stack — von Prompt Injection bis Container-Isolation — mit direkten Links zu jedem Themen-Runbook.

10
OWASP LLM risks covered
5
Dedicated defense guides
6
Container isolation layers
4
JSON-LD schema types

OWASP LLM Top 10 — Threat Coverage Map

Each risk maps to a dedicated ClawGuru defense guide. Click the guide link to jump straight to the runbook.

IDRiskSeverityDefense Guide
LLM01Prompt InjectionCRITICALprompt injection defense
LLM02Insecure Output HandlingHIGHai agent sandboxing
LLM03Training Data PoisoningCRITICALmodel poisoning protection
LLM04Model Denial of ServiceHIGHllm gateway hardening
LLM05Supply Chain VulnerabilitiesHIGHmodel poisoning protection
LLM06Sensitive Info DisclosureHIGHai agent sandboxing
LLM07Insecure Plugin DesignMEDIUMsecure agent communication
LLM08Excessive AgencyHIGHai agent sandboxing
LLM09OverrelianceMEDIUMai agent hardening guide
LLM10Model TheftHIGHllm gateway hardening

Defense Deep-Dives

Five dedicated guides — each a complete playbook with code examples, checklists, and JSON-LD schemas.

5-Layer Defense Architecture

1
L1 — Input Validation
Reject injection patterns before they reach the LLM. Allowlist input types, strip meta-instructions, limit length.
2
L2 — Prompt Architecture
Immutable system prompt in separate channel. XML/JSON delimiters between instructions and user data. Never interpolate raw input.
3
L3 — Container Sandbox
--read-only rootfs, --cap-drop=ALL, --network=none, --user=65534, 30s execution timeout per agent run.
4
L4 — Gateway Security
LLM gateway bound to 127.0.0.1. mTLS or API key auth via reverse proxy. Rate limit per key: 10 req/min.
5
L5 — Behavioral Monitoring
Log all inputs/outputs. Run canary probes. Alert on statistical output distribution shifts. Rotate model versions with integrity checks.

30-Minute Quick-Start Checklist

System prompt in separate, immutable channel (not interpolated with user input)

Injection pattern scanner active on all LLM inputs

Agent container runs as UID 65534 (nobody), read-only rootfs

LLM gateway bound to 127.0.0.1 — zero public exposure

Rate limiting: max 10 LLM calls/min per API key

All agent inputs and outputs logged with correlation ID

Model SHA-256 checksum verified before each deployment

Behavioral test suite runs in CI — deployment blocked on failure

Capability tokens used for agent-to-agent auth (not raw API keys)

Agent execution timeout: 30 seconds hard limit

Compliance: EU AI Act + GDPR

EU AI Act (High-Risk)

High-risk AI systems (healthcare, infrastructure, HR) require: human oversight mechanisms, risk management system, technical documentation, conformity assessment, and post-market monitoring.

GDPR / DSGVO

AI processing personal data: data minimisation (agents only receive what they need), logging with PII masking, purpose limitation, retention limits, and right-to-erasure support in agent memory.

SOC 2 Type II

Audit logging of all agent actions (1-year retention), access controls with least privilege, incident response procedures, and regular security testing of agent systems.

NIS2 (EU)

AI systems in critical infrastructure: risk management obligations, incident reporting within 24h, supply chain security including AI model provenance, and business continuity measures.

Frequently Asked Questions

What is the #1 security risk for AI agents in 2026?

Prompt injection (OWASP LLM01) is the top risk. Attackers embed malicious instructions in user input or external data to hijack agent behavior. Defense requires input validation, structural prompt separation, output parsing, and sandbox isolation.

How do I secure a self-hosted LLM gateway?

Bind Ollama/LocalAI to 127.0.0.1 only, place a reverse proxy (nginx/Caddy) in front with API key auth or mTLS, add rate limiting (max 10 req/min per key), enable audit logging of all prompts, and restrict network access with iptables.

What Docker flags are required for a secure AI agent container?

Use: --read-only, --network=none, --cap-drop=ALL, --no-new-privileges, --user=65534, --memory=512m, --pids-limit=100, and wrap execution in timeout 30. This provides 6 isolation layers with minimal blast radius.

How can I tell if my AI model has been poisoned?

Run a behavioral test suite on every model version: test known refusal scenarios, check for anomalous outputs on synthetic inputs (including known trigger phrases), compare output distributions between model versions, and use SHA-256 checksums of model weights to detect unauthorized modifications.

What is the principle of least privilege for AI agents?

Each agent receives only the minimum permissions for its specific task. A summarization agent needs no filesystem or network access. A code agent reads repos but writes only to feature branches. Use scoped, time-limited capability tokens — never raw API keys or broad database credentials.

Advanced Topics — Batch 5

Further Resources

🔒 Quantum-Resistant Mycelium Architecture
🛡️ 3M+ Runbooks – täglich von SecOps-Experten geprüft
🌐 Zero Known Breaches – Powered by Living Intelligence
🏛️ SOC2 & ISO 27001 Aligned • GDPR 100 % compliant
⚡ Real-Time Global Mycelium Network – 347 Bedrohungen in 60 Minuten
🧬 Trusted by SecOps Leaders worldwide