Agent Memory Security: Securing AI Memory
Agent memory is a persistent attack target: once memories are poisoned, they influence every future agent invocation. Six attack vectors, concrete mitigations, and GDPR-compliant deletion.
What Is Agent Memory Security? Explained Simply
Agent memory security is like a vault for AI memories: agents store information about conversations and facts, and these stores are encrypted and isolated. PII is detected and flagged, and GDPR deletion is possible. Without memory security, an attacker can poison memories that influence every future agent invocation, making memory a persistent attack vector.
Attack Vectors & Mitigations
1. Memory Poisoning
Attacker injects malicious content into agent long-term memory. On future retrieval, the poisoned memory manipulates agent behavior, persisting across sessions.
Fix: Content validation on memory write. Hash-verified retrieval. Anomaly detection on memory update patterns.
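A minimal sketch of write validation plus hash-verified reads, using a toy in-memory store. The class name and the marker list are illustrative assumptions, not a real Moltbot API; a production scanner would use a trained classifier, not a string list.

```python
import hashlib

class VerifiedMemoryStore:
    """Toy store illustrating write validation plus hash-verified reads.
    In production the digests would live in a separate integrity store;
    here they sit next to the content purely for demonstration."""

    def __init__(self):
        self._entries = {}  # entry_id -> (content, sha256 hex digest)

    @staticmethod
    def _digest(content):
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def write(self, entry_id, content):
        # Content validation on write: reject obvious injection markers
        # (illustrative; a real deployment would use a classifier).
        banned = ("ignore previous instructions", "system prompt")
        if any(marker in content.lower() for marker in banned):
            raise ValueError("memory write rejected by content validation")
        self._entries[entry_id] = (content, self._digest(content))

    def read(self, entry_id):
        content, stored_digest = self._entries[entry_id]
        # Hash-verified retrieval: fail closed if the stored bytes no
        # longer match the digest recorded at write time.
        if self._digest(content) != stored_digest:
            raise RuntimeError("memory entry failed integrity check")
        return content
```

Failing closed on a digest mismatch means a tampered entry is never silently fed back into a prompt.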
2. Cross-Tenant Memory Leakage
Agent A retrieves memories belonging to Agent B or User B via crafted queries. Common in shared vector databases without namespace isolation.
Fix: Per-agent, per-user namespace isolation in the vector DB. Scope enforcement on every retrieval query. No shared embedding spaces.
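A sketch of namespace scoping with a dictionary standing in for the vector DB. The key point is that the namespace is derived server-side from the caller's identity, never taken from the query payload. Class and method names are illustrative, not a real vector-DB API.

```python
class NamespacedVectorIndex:
    """Toy index keyed by (agent_id, user_id) to demonstrate
    per-agent, per-user namespace isolation."""

    def __init__(self):
        self._store = {}  # (agent_id, user_id) -> list of documents

    def write(self, agent_id, user_id, doc):
        self._store.setdefault((agent_id, user_id), []).append(doc)

    def query(self, agent_id, user_id, needle):
        # Scope enforcement: only the caller's own namespace is ever
        # searched, so a crafted query cannot reach another tenant.
        docs = self._store.get((agent_id, user_id), [])
        return [d for d in docs if needle.lower() in d.lower()]
```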
3. Delayed Prompt Injection via Retrieval
Injected content stored in memory is later retrieved and included in an LLM prompt, causing injection at retrieval time rather than only at input time.
Fix: Scan retrieved memory chunks for injection patterns before including them in the prompt. Treat memory as untrusted user input.
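A read-side filter sketched with regex patterns. The pattern list is a stand-in assumption; a production scanner would use a trained injection-detection model as described in the FAQ below.

```python
import re

# Illustrative patterns only; not an exhaustive or production list.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"<\s*system\s*>", re.I),
]

def filter_retrieved_chunks(chunks):
    """Treat memory as untrusted input: quarantine any retrieved chunk
    matching a known injection pattern before it reaches the prompt."""
    safe, quarantined = [], []
    for chunk in chunks:
        if any(p.search(chunk) for p in INJECTION_PATTERNS):
            quarantined.append(chunk)
        else:
            safe.append(chunk)
    return safe, quarantined
```

Quarantining (rather than silently dropping) keeps the suspicious chunks available for anomaly review.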
4. PII Retention Without Erasure
Personally identifiable information stored in agent memory without an expiry or deletion mechanism. Violates GDPR Art. 5 and the Right to Erasure (Art. 17).
Fix: PII detection on memory write. Configurable retention. A right-to-erasure API that purges all user-linked embeddings.
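A write-time tagging sketch. The two regex detectors (email, phone) are minimal illustrative assumptions; real systems use a dedicated PII-detection model covering many more categories.

```python
import re

# Minimal illustrative detectors; a real system would use a PII model.
PII_DETECTORS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def tag_pii(content):
    """Return a memory record with detected PII types flagged, so a
    later erasure request can locate every affected entry."""
    found = sorted(t for t, rx in PII_DETECTORS.items() if rx.search(content))
    return {"content": content, "pii_types": found, "erasable": bool(found)}
```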
5. Memory Replay
Stale or replayed memories from old sessions are used to influence current agent behavior, e.g. old authorization context replayed to bypass current access controls.
Fix: Timestamp-based memory expiry. Session binding on sensitive memories. Version tokens on memory entries.
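A replay defence sketched as a single validity check: an entry is usable only if it has not expired and, when marked sensitive, was written in the current session. The field names (written_at, sensitive, session_id) are illustrative assumptions.

```python
import time

def is_memory_valid(entry, current_session, now=None, ttl_seconds=90 * 86400):
    """Reject expired entries and sensitive entries from other sessions.
    Field names are illustrative, not a fixed schema."""
    now = time.time() if now is None else now
    if now - entry["written_at"] > ttl_seconds:
        return False  # timestamp-based expiry (default 90 days)
    if entry.get("sensitive") and entry.get("session_id") != current_session:
        return False  # session binding on sensitive memories
    return True
```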
6. Embedding Inversion
Vector embeddings stored in the DB can be partially inverted to recover the original text, so sensitive information is recoverable from embeddings alone.
Fix: Encrypt embeddings at rest. Use embedding-only indexes (not raw text) where possible. Access control on the vector DB.
Secure Memory Configuration
# moltbot.memory.yaml — secure agent memory configuration
memory:
  backend: pgvector              # or chroma, weaviate, qdrant
  encryption:
    at_rest: aes-256-gcm         # Encrypt embeddings + raw text
    key_rotation: 90d            # Rotate encryption keys every 90 days
  isolation:
    namespace_by_agent: true     # Each agent ID → own namespace
    namespace_by_user: true      # Each user ID → own namespace within agent
    cross_agent_reads: false     # Never allow agent A to read agent B's memory
  security:
    injection_scan_on_write: true  # Scan content before storing
    injection_scan_on_read: true   # Scan retrieved chunks before prompt inclusion
    pii_detection: true            # Detect PII on write, flag for review
    pii_auto_tag: true             # Tag memories containing PII for deletion tracking
  retention:
    default_ttl_days: 90         # Auto-expire memories after 90 days
    user_data_ttl_days: 365      # Configurable per data class
    on_erasure_request: immediate  # GDPR Art. 17 — delete within 24h
  audit:
    log_all_reads: true          # Record every memory retrieval with user + agent + query_hash
    log_all_writes: true
    retention_years: 3           # Audit log retention

Frequently Asked Questions
What is agent memory and why is it a security risk?
Agent memory is persistent storage that allows AI agents to retain information across conversations and sessions. It typically uses a vector database (Pinecone, Chroma, Weaviate, pgvector) to store embeddings of past interactions, facts, and user preferences. The security risk: this memory is read back into LLM prompts at retrieval time — making it a persistent attack surface. Any content injected into memory (directly or via a previous conversation) can influence future agent behavior.
How does Moltbot isolate memory between users and agents?
Moltbot enforces namespace isolation at three levels: 1) Agent-level: each agent ID gets its own embedding namespace. An agent cannot query outside its namespace. 2) User-level: within an agent, each user's memories are further isolated by user_id namespace. Retrievals are always scoped to the current user. 3) Permission-level: sensitive memory types (authentication context, financial data) require explicit capability tokens to retrieve, even for the owning agent.
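The three gates above can be sketched as a single authorization check. All names (function, parameters, the sensitive-type set) are illustrative assumptions, not the Moltbot implementation.

```python
# Memory types that require an explicit capability token to retrieve,
# even for the owning agent (illustrative set).
SENSITIVE_TYPES = {"auth_context", "financial"}

def authorize_retrieval(agent_id, owner_agent_id, user_id, owner_user_id,
                        memory_type, capability_tokens):
    """Three-gate check mirroring agent-, user-, and permission-level
    isolation. Returns True only if all gates pass."""
    if agent_id != owner_agent_id:
        return False  # agent-level namespace: no cross-agent reads
    if user_id != owner_user_id:
        return False  # user-level namespace: scoped to current user
    if memory_type in SENSITIVE_TYPES and memory_type not in capability_tokens:
        return False  # permission level: capability token required
    return True
```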
How do I implement GDPR Right to Erasure for agent memory?
GDPR Art. 17 requires erasure of personal data on request without undue delay; Art. 12(3) sets a baseline of one month. For agent memory: 1) Tag all memories with user_id at write time. 2) Maintain a deletion index mapping user_id to entry IDs. 3) On an erasure request: delete all embeddings tagged with that user_id from the vector DB, delete the raw text from any backing store, remove the entries from the deletion index, and generate an erasure confirmation log with a timestamp. Moltbot's erasure API handles all of this: moltbot.memory.erase_user(user_id='u123', confirm=True).
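The erasure steps above can be sketched over in-memory stand-ins for the real stores. All parameter names are illustrative; this is not the Moltbot erasure API, just the same flow.

```python
import time

def erase_user(user_id, vector_db, raw_store, deletion_index, audit_log):
    """Right-to-erasure flow: look up the user's entries via the
    deletion index, purge embeddings and raw text, then record an
    erasure confirmation. Returns the number of entries deleted."""
    entry_ids = deletion_index.pop(user_id, [])  # remove index entry
    for eid in entry_ids:
        vector_db.pop(eid, None)   # delete user-linked embeddings
        raw_store.pop(eid, None)   # delete raw text from backing store
    audit_log.append({             # erasure confirmation with timestamp
        "event": "erasure_confirmed",
        "user_id": user_id,
        "deleted_entries": len(entry_ids),
        "timestamp": time.time(),
    })
    return len(entry_ids)
```

Tagging at write time is what makes this O(entries-per-user) instead of a full scan of the vector DB.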
Can prompt injection via retrieved memory be fully prevented?
Not 100%, but the risk can be substantially reduced: 1) Scan every retrieved memory chunk with an injection detection model before including it in the prompt. 2) Separate trusted memory (agent-written) from untrusted memory (user-sourced) using namespaces with different trust levels. 3) Use structured memory (key-value facts) instead of raw text where possible; it is much harder to inject into. 4) Limit retrieved memory context to roughly 20% of the total prompt to shrink the injection surface. 5) Run memory-derived prompts through a separate safety classifier before execution.
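Point 4, the context budget, can be sketched as a simple cap. The word count here is a crude stand-in for a real tokenizer, and the function name is an illustrative assumption.

```python
def cap_memory_context(chunks, total_budget, max_fraction=0.2):
    """Admit retrieved chunks (in ranking order) until memory would
    exceed max_fraction of the total prompt budget. Word count is a
    rough proxy for tokens, used for illustration only."""
    budget = int(total_budget * max_fraction)
    kept, used = [], 0
    for chunk in chunks:
        cost = len(chunk.split())
        if used + cost > budget:
            break  # stop before exceeding the memory budget
        kept.append(chunk)
        used += cost
    return kept
```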