Agent Memory Security
Agent memory is a persistent attack surface: once poisoned, memories influence every future agent call. This page covers six attack vectors, concrete mitigations, and GDPR-compliant erasure.
What is Agent Memory Security? Simply Explained
Agent memory security is like a vault for AI memories. Agents store information about conversations and facts; that memory is encrypted and isolated, PII is detected and flagged, and GDPR erasure is supported. Without these controls, an attacker can poison memories that influence every future agent call, making memory a persistent attack vector.
Attack Vectors & Mitigations
1. Memory Poisoning
An attacker injects malicious content into the agent's long-term memory. On future retrieval, the poisoned memory manipulates agent behavior, and the effect persists across sessions.
Fix: Content validation on memory write. Hash-verified retrieval. Anomaly detection on memory update patterns.
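
A minimal sketch of the first two controls, assuming a simple in-process store; the pattern list, store, and function names are illustrative, not Moltbot APIs.

# sketch: validated writes + hash-verified reads (illustrative, not the Moltbot API)
import hashlib
import re

INJECTION_PATTERNS = [r"ignore (all )?previous instructions", r"reveal your system prompt"]

_store: dict[str, dict] = {}

def write_memory(memory_id: str, content: str) -> None:
    """Reject suspicious content, then store it alongside a SHA-256 digest."""
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, content, re.IGNORECASE):
            raise ValueError(f"memory write rejected: matched {pattern!r}")
    _store[memory_id] = {
        "content": content,
        "sha256": hashlib.sha256(content.encode()).hexdigest(),
    }

def read_memory(memory_id: str) -> str:
    """Recompute the digest on read; a mismatch means the stored row was tampered with."""
    entry = _store[memory_id]
    if hashlib.sha256(entry["content"].encode()).hexdigest() != entry["sha256"]:
        raise RuntimeError(f"memory {memory_id} failed hash verification")
    return entry["content"]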
2. Cross-Agent and Cross-User Leakage
Agent A retrieves memories belonging to Agent B or User B via crafted queries. Common in shared vector databases without namespace isolation.
Fix: Per-agent, per-user namespace isolation in the vector DB. Scope enforcement on every retrieval query. No shared embedding spaces.
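
A sketch of scope enforcement, assuming a Pinecone-style index with a query(vector=..., top_k=..., namespace=...) signature. The key point: the namespace is derived from the authenticated identity, never from the query payload, so a crafted query cannot widen its own scope.

# sketch: per-agent, per-user namespace scoping (Pinecone-style client assumed)
def namespace_for(agent_id: str, user_id: str) -> str:
    """Every read and write is keyed to agent + user; there is no global namespace."""
    return f"{agent_id}/{user_id}"

def scoped_query(index, agent_id: str, user_id: str, embedding: list[float], k: int = 5):
    """agent_id and user_id come from the authenticated session, not the request body."""
    return index.query(vector=embedding, top_k=k,
                       namespace=namespace_for(agent_id, user_id))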
3. Stored Prompt Injection
Injected content stored in memory is later retrieved and included in an LLM prompt, so the injection fires at retrieval time, not just at input time.
Fix: Scan retrieved memory chunks for injection patterns before including them in the prompt. Treat memory as untrusted user input.
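
A sketch of read-time scanning, with an illustrative regex blocklist standing in for a real injection classifier, and retrieved memory delimited as untrusted inside the prompt.

# sketch: treat retrieved memory as untrusted input (patterns are illustrative)
import re

SUSPICIOUS = [
    r"ignore (all )?previous instructions",
    r"you are now",
    r"reveal your system prompt",
]

def sanitize_retrieved(chunks: list[str]) -> list[str]:
    """Drop retrieved chunks matching known injection patterns; a production
    system would quarantine them and use a classifier, not regexes alone."""
    return [c for c in chunks
            if not any(re.search(p, c, re.IGNORECASE) for p in SUSPICIOUS)]

def build_prompt(system: str, memories: list[str], user_input: str) -> str:
    """Label retrieved memory as untrusted so the model can be instructed
    never to follow directives inside the delimited block."""
    memory_block = "\n".join(f"- {m}" for m in sanitize_retrieved(memories))
    return (f"{system}\n\n[UNTRUSTED MEMORY]\n{memory_block}\n"
            f"[/UNTRUSTED MEMORY]\n\nUser: {user_input}")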
4. Unbounded PII Retention
Personally identifiable information is stored in agent memory with no expiry or deletion mechanism, violating GDPR Art. 5 (storage limitation) and the Right to Erasure (Art. 17).
Fix: PII detection on memory write. Configurable retention. A right-to-erasure API that purges all user-linked embeddings.
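
A sketch of write-time PII tagging; the regexes are deliberately simple illustrations, and a production system would typically use a dedicated detector such as Microsoft Presidio.

# sketch: PII tagging at write time (regexes illustrative only)
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}

def pii_tags(content: str) -> list[str]:
    """Return the PII classes found, so the memory entry can be tagged
    for retention tracking and located later by an erasure request."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(content)]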
5. Memory Replay
Stale or replayed memories from old sessions are used to influence current agent behavior; old authorization context can be replayed to bypass current access controls.
Fix: Timestamp-based memory expiry. Session binding on sensitive memories. Version tokens on memory entries.
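
A sketch of the read-time checks, assuming each memory entry carries written_at, session_bound, and session_id fields (the field names are illustrative).

# sketch: expiry + session binding enforced on every read
import time

def is_usable(entry: dict, current_session: str, ttl_seconds: int = 90 * 86400) -> bool:
    """Reject entries past their TTL, and reject session-bound entries
    (e.g. old authorization context) replayed from a different session."""
    if time.time() - entry["written_at"] > ttl_seconds:
        return False
    if entry.get("session_bound") and entry["session_id"] != current_session:
        return False
    return True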
6. Embedding Inversion
Vector embeddings stored in the DB can be partially inverted to recover the original text, so sensitive information is recoverable from the embeddings alone.
Fix: Encrypt embeddings at rest. Use embedding-only indexes (not raw text) where possible. Access control on the vector DB.
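
A sketch of at-rest encryption for embedding vectors using the cryptography package's AES-GCM primitive; key management (KMS, rotation) is out of scope here.

# sketch: AES-256-GCM over packed embedding vectors
import os
import struct
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # in production: fetch from a KMS

def encrypt_embedding(vector: list[float], key: bytes) -> bytes:
    """Pack the floats to bytes, then seal with AES-GCM; the nonce is prepended."""
    raw = struct.pack(f"{len(vector)}f", *vector)
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, raw, None)

def decrypt_embedding(blob: bytes, key: bytes) -> list[float]:
    """Split off the nonce, decrypt, and unpack back to floats."""
    nonce, ciphertext = blob[:12], blob[12:]
    raw = AESGCM(key).decrypt(nonce, ciphertext, None)
    return list(struct.unpack(f"{len(raw) // 4}f", raw))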
Secure Memory Configuration
# moltbot.memory.yaml: secure agent memory configuration
memory:
  backend: pgvector                # or chroma, weaviate, qdrant
  encryption:
    at_rest: aes-256-gcm           # encrypt embeddings + raw text
    key_rotation: 90d              # rotate encryption keys every 90 days
  isolation:
    namespace_by_agent: true       # each agent ID gets its own namespace
    namespace_by_user: true        # each user ID gets its own namespace within the agent
    cross_agent_reads: false       # never allow agent A to read agent B's memory
  security:
    injection_scan_on_write: true  # scan content before storing
    injection_scan_on_read: true   # scan retrieved chunks before prompt inclusion
    pii_detection: true            # detect PII on write, flag for review
    pii_auto_tag: true             # tag memories containing PII for deletion tracking
  retention:
    default_ttl_days: 90           # auto-expire memories after 90 days
    user_data_ttl_days: 365        # configurable per data class
    on_erasure_request: immediate  # GDPR Art. 17: delete within 24h
  audit:
    log_all_reads: true            # record every memory retrieval with user+agent+query_hash
    log_all_writes: true
    retention_years: 3             # audit log retention

Frequently Asked Questions
What is agent memory and why is it a security risk?
Agent memory is persistent storage that allows AI agents to retain information across conversations and sessions. It typically uses a vector database (Pinecone, Chroma, Weaviate, pgvector) to store embeddings of past interactions, facts, and user preferences. The security risk: this memory is read back into LLM prompts at retrieval time, making it a persistent attack surface. Any content injected into memory (directly or via a previous conversation) can influence future agent behavior.
How does Moltbot isolate memory between users and agents?
Moltbot enforces namespace isolation at three levels: 1) Agent-level: each agent ID gets its own embedding namespace. An agent cannot query outside its namespace. 2) User-level: within an agent, each user's memories are further isolated by user_id namespace. Retrievals are always scoped to the current user. 3) Permission-level: sensitive memory types (authentication context, financial data) require explicit capability tokens to retrieve, even for the owning agent.
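
A sketch of how those three levels might compose at retrieval time; the function names, the capability store, and the Pinecone-style query signature are illustrative, not Moltbot internals.

# sketch: three-level scoping on retrieval (illustrative names)
ISSUED_CAPABILITIES: dict[str, set[str]] = {}  # token -> granted memory types

def verify_capability(token: str, memory_type: str) -> bool:
    """Placeholder check; a real system validates a signed scope claim."""
    return memory_type in ISSUED_CAPABILITIES.get(token, set())

def retrieve(index, agent_id: str, user_id: str, query_vec: list[float],
             memory_type: str, capability_token: str | None = None):
    namespace = f"{agent_id}/{user_id}"               # levels 1 + 2: agent + user scope
    if memory_type in {"auth_context", "financial"}:  # level 3: capability gate
        if not (capability_token and verify_capability(capability_token, memory_type)):
            raise PermissionError(f"capability token required for {memory_type}")
    return index.query(vector=query_vec, top_k=5, namespace=namespace,
                       filter={"type": memory_type})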
How do I implement GDPR Right to Erasure for agent memory?
GDPR Art. 17 requires deletion of personal data on request without undue delay, at most within one month (Art. 12(3)). For agent memory: 1) Tag all memories with user_id at write time. 2) Maintain a deletion index mapping each user_id to their memory entries. 3) On an erasure request: delete all embeddings tagged with that user_id from the vector DB, delete the raw text from any backing store, clear the user's entries from the deletion index, and generate an erasure confirmation log with a timestamp. Moltbot's erasure API handles all of this: moltbot.memory.erase_user(user_id='u123', confirm=True).
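
A sketch of what such an erasure call might do internally; vector_index, text_store, and deletion_index are hypothetical dependencies, not Moltbot's actual internals.

# sketch: the four erasure steps behind an erase_user-style API
import json
import time

def erase_user(vector_index, text_store, deletion_index: dict, user_id: str) -> dict:
    """Purge embeddings, raw text, and index entries, then emit a confirmation log."""
    vector_index.delete(filter={"user_id": user_id})  # 1. embeddings, by metadata filter
    text_store.delete_where(user_id=user_id)          # 2. raw text in the backing store
    deletion_index.pop(user_id, None)                 # 3. the user's deletion-index entry
    receipt = {"user_id": user_id, "erased_at": time.time(), "status": "complete"}
    print(json.dumps(receipt))                        # 4. erasure confirmation log
    return receipt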
Can prompt injection via retrieved memory be fully prevented?
Not 100%, but the risk can be substantially reduced: 1) Scan every retrieved memory chunk with an injection detection model before including it in the prompt. 2) Separate trusted memory (agent-written) from untrusted memory (user-sourced) using different namespaces with different trust levels. 3) Use structured memory (key-value facts) instead of raw text where possible; it is much harder to inject into. 4) Limit retrieved memory context to 20% of the total prompt to reduce the injection surface (see the sketch after this answer). 5) Run memory-retrieved prompts through a separate safety classifier before execution.
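
As a concrete example of point 4, a sketch of capping retrieved memory at a fixed share of the prompt budget; the 20% figure comes from the answer above, and the whitespace token estimate is deliberately crude.

# sketch: cap retrieved memory at a share of the prompt budget
def budget_memories(memories: list[str], prompt_budget_tokens: int,
                    memory_share: float = 0.20) -> list[str]:
    """Greedily include memories until they would exceed their prompt share."""
    budget = int(prompt_budget_tokens * memory_share)
    selected, used = [], 0
    for m in memories:
        cost = len(m.split())  # crude estimate; use a real tokenizer in practice
        if used + cost > budget:
            break
        selected.append(m)
        used += cost
    return selected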