
GDPR & AI: Privacy-Compliant AI Data Processing

AI systems create new GDPR challenges: legal basis for LLM training, erasure rights in vector databases, and DPIAs for high-risk AI. This guide maps six GDPR articles to concrete implementation measures.

6 GDPR Articles for AI Systems

Art. 6: Legal Basis for AI Data Processing

Every AI system processing personal data needs a specific legal basis. The most applicable for enterprise AI:

- Legitimate interests (Art. 6(1)(f)): internal AI tools processing employee data for productivity. Conduct a Legitimate Interest Assessment (LIA) documenting the balancing test.
- Contract performance (Art. 6(1)(b)): customer-facing AI that processes data to fulfill a service contract.
- Consent (Art. 6(1)(a)): consumer AI chatbots processing personal conversations. Obtain explicit consent and allow withdrawal.
- Avoid: claiming legitimate interests for customer-facing AI where the processing is unexpected or highly intrusive.
- Document: Moltbot's processing activities register should capture the legal basis per processing activity; a minimal register entry is sketched below.
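
As an illustration, a records-of-processing entry that ties each AI activity to its legal basis might look like the following sketch. The schema and field names are assumptions for illustration, not Moltbot's actual register format.

# Illustrative Art. 30 records-of-processing entry; the schema and field names
# are assumptions, not Moltbot's actual register format.
from dataclasses import dataclass, field

@dataclass
class ProcessingActivity:
    name: str
    purpose: str
    legal_basis: str              # e.g. "Art. 6(1)(f) legitimate interests"
    lia_reference: str | None     # link to the LIA if the basis is LI
    data_categories: list[str] = field(default_factory=list)
    retention: str = ""

register = [
    ProcessingActivity(
        name="internal-summarization-assistant",
        purpose="Employee productivity (document summarization)",
        legal_basis="Art. 6(1)(f) legitimate interests",
        lia_reference="LIA-2024-007",
        data_categories=["employee name", "document content"],
        retention="30 days",
    ),
]

# Fail fast: no activity may claim legitimate interests without a documented LIA.
for activity in register:
    if "6(1)(f)" in activity.legal_basis:
        assert activity.lia_reference, f"{activity.name}: LI claimed without an LIA"

The assertion at the end enforces the rule stated above: an activity may not rely on legitimate interests without a documented LIA.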

Art. 13/14: Transparency & Information Obligations

Users interacting with AI systems must be informed:

- At the point of collection: that an AI system processes their data, what data is processed and why, retention periods, and any third-party LLM providers involved.
- AI-specific transparency: if the AI makes automated decisions with significant effects (Art. 22), users have the right to human review, an explanation of the logic involved, and to contest the decision.
- Implementation: add an AI disclosure to the privacy notice, include it in the chatbot welcome message (sketched below), and document it in the DPA with cloud LLM providers.
- Moltbot: configure a disclosure banner on all AI interaction interfaces.
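
A minimal sketch of a welcome-message disclosure, shown before any user input is processed. The session layout and the exact wording are assumptions, not a Moltbot API.

# Sketch of an Art. 13 disclosure shown before any user input is processed.
# Session layout and wording are assumptions, not a Moltbot API.

AI_DISCLOSURE = (
    "You are chatting with an AI assistant. Your messages are processed by a "
    "third-party LLM provider and retained for 30 days. See the privacy "
    "notice for details on your data and your rights."
)

sessions: dict[str, dict] = {}

def start_session(session_id: str) -> dict:
    """Open a chat session whose first message is the AI disclosure."""
    session = {
        "id": session_id,
        "disclosure_shown": True,  # keep this flag for your compliance records
        "messages": [{"role": "assistant", "content": AI_DISCLOSURE}],
    }
    sessions[session_id] = session
    return session

print(start_session("demo")["messages"][0]["content"])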

Art. 17: Right to Erasure in AI Systems

The right to erasure creates specific challenges for AI:

- Conversation history: delete all stored conversation logs containing that user's data.
- RAG corpus: if documents containing the user's personal data were indexed, remove them and re-index without that content.
- Fine-tuning data: if the model was fine-tuned on data containing the user's information, erasure is complex and may require model retraining.
- LLM memorization: foundation models may have memorized training data, so no technical erasure is possible (document this limitation in the privacy notice).

Practical approach: keep detailed data lineage so you know exactly where personal data enters AI systems, and implement erasure workflows for conversation stores, vector databases, and fine-tuning datasets (sketched below). Moltbot's erasure API automates conversation and RAG corpus deletion.
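
The following in-memory sketch shows the shape of such an erasure workflow across the three stores. The store layouts and names are assumptions for illustration, not Moltbot's erasure API.

# In-memory sketch of an Art. 17 erasure workflow. Store layouts and names are
# assumptions for illustration, not Moltbot's actual erasure API.

conversation_log = [
    {"user_id": "u42", "text": "What is my order status?"},
    {"user_id": "u7", "text": "Reset my password."},
]
rag_index = {  # chunk_id -> metadata; a real system would also re-embed
    "c1": {"data_subject": "u42", "source_doc": "d9"},
    "c2": {"data_subject": "u7", "source_doc": "d3"},
}
finetune_dataset = [{"user_id": "u42", "sample": "..."}]

def erase_user(user_id: str) -> dict:
    global conversation_log, finetune_dataset
    before = len(conversation_log)
    # 1. Conversation history: hard-delete every log entry keyed to the user.
    conversation_log = [m for m in conversation_log if m["user_id"] != user_id]
    # 2. RAG corpus: drop the user's chunks; affected documents need re-indexing.
    dropped = [cid for cid, meta in rag_index.items()
               if meta["data_subject"] == user_id]
    reindex = {rag_index.pop(cid)["source_doc"] for cid in dropped}
    # 3. Fine-tuning data: remove the records; models already trained on them
    #    may require retraining (a documented limitation, see above).
    finetune_dataset = [s for s in finetune_dataset if s["user_id"] != user_id]
    return {"conversations_deleted": before - len(conversation_log),
            "rag_chunks_removed": len(dropped),
            "docs_to_reindex": sorted(reindex)}

print(erase_user("u42"))

A real implementation would also re-embed the affected source documents and record each erasure in an audit log.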

Art. 25: Privacy by Design for AI Systems

Build privacy into the AI system architecture from day one:

- Data minimization: configure the AI to process only the minimum required data (don't log full conversations if a session hash is sufficient).
- Storage limitation: define and enforce retention periods for all AI data stores.
- Access control: least privilege for all components that touch personal data.
- Separation: AI audit logs should contain metadata (hashes), not raw personal data; see the logging sketch below.
- Pseudonymization: replace user identifiers with pseudonyms in AI training data.
- Default settings: opt out of data retention by default (opt in for personalization features).
- Architecture review: a DPIA (Art. 35) is required before deploying high-risk AI systems.
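
A minimal sketch of pseudonymized audit logging, assuming a keyed hash for user identifiers and a content digest instead of the raw prompt. The names and the 30-day retention value are illustrative.

# Sketch of privacy-by-design audit logging: a keyed hash of the user ID and a
# content digest are stored, never the raw prompt. All names are illustrative.
import hashlib
import hmac
import json
import time

PSEUDONYM_KEY = b"store-in-a-kms-and-rotate"  # assumption: a managed secret

def pseudonym(user_id: str) -> str:
    """Keyed pseudonym: without the key, identifiers can't be brute-forced."""
    return hmac.new(PSEUDONYM_KEY, user_id.encode(), hashlib.sha256).hexdigest()[:16]

def audit_event(user_id: str, prompt: str) -> str:
    """Emit a JSON log line with metadata only, per the separation principle."""
    return json.dumps({
        "ts": int(time.time()),
        "user": pseudonym(user_id),
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "retention_days": 30,  # storage limitation, enforced by the log store
    })

print(audit_event("alice@example.com", "Summarize the Q3 report"))

Using an HMAC rather than a bare hash means identifiers cannot be brute-forced offline without the key, which should live in a key management service and be rotated.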

Art. 28: Data Processing Agreements for LLM Providers

Using cloud LLM APIs (OpenAI, Anthropic, Azure OpenAI) requires a DPA with each provider. Key DPA clauses to verify:

- Data not used for training: confirm in the DPA that your prompts and outputs are not used to train future models (the major providers offer this for API traffic).
- Data residency: confirm that EU data stays in the EU (requires EU-region API endpoints).
- Sub-processor list: obtain and review the provider's sub-processor list.
- Data retention: confirm the provider's retention period for your data.
- Breach notification: confirm the provider will notify you within 72 hours of a breach affecting your data.

Action: sign DPAs with all cloud LLM providers before going live, and store them with your vendor management documentation; a checklist sketch follows below.
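
One way to keep this verification auditable is a machine-readable checklist per provider. The clause names below mirror the list above and are assumptions, not any provider's actual contract terms.

# Illustrative, machine-readable DPA checklist per LLM provider. Clause names
# mirror the list above and are assumptions, not actual contract terms.

REQUIRED_CLAUSES = {
    "no_training_on_customer_data",
    "eu_data_residency",
    "subprocessor_list_reviewed",
    "retention_period_confirmed",
    "breach_notification_72h",
}

def dpa_gaps(signed_clauses: set[str]) -> set[str]:
    """Return the clauses still missing before the provider may go live."""
    return REQUIRED_CLAUSES - signed_clauses

# Example: a provider whose DPA so far covers only training exclusion and residency.
print(sorted(dpa_gaps({"no_training_on_customer_data", "eu_data_residency"})))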

Art. 35: DPIA for High-Risk AI Processing

A Data Protection Impact Assessment is mandatory before deploying AI that processes personal data at large scale, uses profiling or automated decision-making, processes sensitive data (health, biometric, financial), or systematically monitors individuals. DPIA components for AI:

- Description: what data, what AI model, what processing purpose.
- Necessity and proportionality: is AI necessary, or could less privacy-invasive means achieve the goal?
- Risk assessment: data breach, discrimination, loss of control, model hallucination creating false records.
- Mitigation measures: encryption, access control, human oversight, model output validation.
- Residual risks: after mitigations, document the remaining risks and the rationale for proceeding.
- Consult the supervisory authority: if residual risks remain high, consult your national data protection authority before deployment (Art. 36 prior consultation).
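
The trigger conditions lend themselves to a simple pre-deployment gate, sketched below. The flag names are illustrative, not a formal DPIA methodology.

# Sketch of a pre-deployment DPIA gate based on the Art. 35 triggers above.
# The flag names are illustrative, not a formal DPIA methodology.

def dpia_required(system: dict) -> bool:
    """Return True if any Art. 35 trigger applies to the described system."""
    return any([
        system.get("large_scale_personal_data", False),
        system.get("profiling_or_automated_decisions", False),
        system.get("sensitive_data", False),        # health, biometric, financial
        system.get("systematic_monitoring", False),
    ])

support_chatbot = {"large_scale_personal_data": True}
if dpia_required(support_chatbot):
    print("Complete the DPIA (and Art. 36 consultation if needed) before deploying.")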

Frequently Asked Questions

Can I use personal data to fine-tune AI models under GDPR?

Yes, but with significant constraints:

- Legal basis: you need a valid GDPR legal basis (most likely legitimate interest with an LIA, or consent). Consent is required if data subjects would not reasonably expect their data to be used for AI training.
- Purpose limitation: the fine-tuning purpose must be compatible with the original collection purpose. If data was collected for customer service and you fine-tune an AI on it, document the compatibility assessment.
- Data minimization: fine-tune on the minimum data necessary; pseudonymize or anonymize where possible.
- Erasure: implement a process to remove a data subject's contribution from the training dataset on an erasure request; this may require model retraining.
- Data retention: define how long you retain training datasets and implement deletion.

Safe option: fine-tune only on synthetic or fully anonymized data, which eliminates most GDPR complexity. A scrubbing sketch follows below.
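
A minimal sketch of identifier scrubbing before training. The regexes catch only obvious direct identifiers; real anonymization needs NER-based PII detection and human review, so treat this as a first pass, not full anonymization.

# First-pass PII scrubbing for fine-tuning data. The patterns are illustrative
# and catch only obvious direct identifiers (emails, phone numbers).
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s/-]{7,}\d"),
}

def scrub(sample: str) -> str:
    """Replace matched identifiers with placeholder tokens."""
    for label, pattern in PII_PATTERNS.items():
        sample = pattern.sub(f"[{label}]", sample)
    return sample

print(scrub("Contact Jane at jane.doe@example.com or +49 151 2345678."))

Note that the name "Jane" survives this pass, which is exactly why NER-based detection and manual review remain necessary.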

Do I need to disclose to users that they are interacting with an AI?

Under GDPR Art. 13/14 (transparency) and increasingly under national AI transparency laws: yes.

- You must disclose AI processing in the privacy notice, including the purpose and legal basis.
- For automated decision-making with significant effects (credit scoring, hiring, medical diagnosis), Art. 22 applies: inform users of the logic involved, the right to human review, and the right to contest the decision.
- EU AI Act (transparency obligations apply from 2026): chatbots must disclose that they are AI systems unless the context makes it obvious.

Best practice: add a clear disclosure at the start of every AI interaction ("You are chatting with an AI assistant"), include specifics in the privacy notice (which model, which provider, what data is processed), and document the disclosure in your records of processing.

What is the legal basis for using employee data in AI systems?

Employee data in AI systems requires careful legal basis selection:

- Internal productivity tools (summarization, drafting): legitimate interest (Art. 6(1)(f)). Document the LIA showing the business need outweighs the privacy impact, minimize the data logged, keep retention short, and inform employees via the works council or employee handbook.
- Performance monitoring AI: in most EU jurisdictions this requires a works council agreement (Betriebsrat in Germany, §87 BetrVG). Legitimate interest alone is insufficient for systematic employee monitoring.
- Consent: generally not appropriate as a legal basis for employee processing, because the power imbalance means employees cannot freely consent to employer data processing.
- Contractual necessity: only for AI tools genuinely necessary to perform the employment contract (e.g., remote-work monitoring tools explicitly provided for in the contract).

How do I handle GDPR data subject access requests (DSARs) for AI systems?

DSARs for AI systems require extending your DSAR process:

- Identify all AI data stores containing personal data: conversation logs, vector store entries (RAG corpus with personal information), fine-tuning datasets, and AI audit logs.
- Provide in the DSAR response: what personal data the AI systems process, the purposes and legal basis, retention periods, and any automated decisions made about the data subject together with the logic involved.
- Technical challenges: conversation logs are searchable by user_id (manageable); the RAG corpus may contain the data subject's documents (requires search by name or identifier); fine-tuning data may include their communications (requires a dataset audit); model weights cannot be provided, so note in the DSAR response that the underlying model may have memorized training data.
- Timeline: one month to respond, extendable by two further months for complex requests.

Use Moltbot's data lineage tools to map all personal data flows through AI systems; a search sketch follows below.
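
The lookup side of a DSAR export can be sketched against the same stores used in the erasure example above. The in-memory layouts are assumptions standing in for real systems.

# Sketch of a DSAR export that walks the AI data stores named above.
# The in-memory store layouts are assumptions, not a Moltbot API.

conversation_log = [{"user_id": "u42", "text": "What is my order status?"}]
rag_index = {"c1": {"data_subject": "u42", "source_doc": "d9"},
             "c2": {"data_subject": "u7", "source_doc": "d3"}}
finetune_dataset = [{"user_id": "u42", "sample": "..."}]

def dsar_export(user_id: str) -> dict:
    """Collect the data subject's footprint across all AI data stores."""
    return {
        "conversations": [m for m in conversation_log if m["user_id"] == user_id],
        "rag_source_docs": sorted({meta["source_doc"]
                                   for meta in rag_index.values()
                                   if meta["data_subject"] == user_id}),
        "finetune_sample_count": sum(1 for s in finetune_dataset
                                     if s["user_id"] == user_id),
        # Model weights cannot be exported; state the memorization caveat instead.
        "model_note": "Weights not exportable; see memorization caveat above.",
    }

print(dsar_export("u42"))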
