GDPR Has Always Required Data Integrity. AI Makes It Impossible to Ignore.
GDPR has required data integrity since 2018. Article 5(1)(f) mandates 'appropriate security of the personal data, including protection against unauthorised or unlawful processing and against accidental loss, destruction or damage.' Article 5(1)(d) requires that personal data be 'accurate and, where necessary, kept up to date.'
For most organisations, these requirements were interpreted as access control and backup policies. Strong passwords, role-based access, regular backups. Technically compliant. Practically insufficient.
AI changes this calculus. When personal data feeds into AI training pipelines, recommendation engines, automated decision systems, or HR screening tools, accuracy stops being a property of individual records and becomes a property of every decision built on them. A single inaccurate or tampered record in a training dataset can propagate systematically across thousands of decisions. Where those decisions are based solely on automated processing and produce legal or similarly significant effects, each may be individually challengeable under GDPR Article 22.
Data Protection Authorities across the EU are beginning to make this connection.
The Article 5 Requirements That AI Deployments Activate
Three Article 5 principles become materially harder to satisfy when personal data enters AI pipelines:
Accuracy (Article 5(1)(d)). Personal data must be accurate and, where necessary, kept up to date. In AI contexts, this means not just that the source data was accurate when collected, but that it remained accurate and unmodified through every stage of the AI pipeline. An individual has the right to rectification under Article 16, but if a model has already been trained on inaccurate data, rectifying the source record does not undo the model's learned patterns.
Storage limitation (Article 5(1)(e)). Personal data must be kept in identifiable form for no longer than is necessary for the purposes for which it is processed. AI training datasets often contain personal data that was collected for a different purpose and retained beyond the original storage period. If that data is still in a training corpus, its continued presence may itself breach the storage limitation principle.
Integrity and confidentiality (Article 5(1)(f)). In practice, Article 5(1)(f) is often treated as a confidentiality requirement, but supervisory authorities are increasingly interpreting its integrity element to require verifiable integrity: the ability to demonstrate that data has not been accidentally or deliberately altered.
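One way to make "verifiable integrity" concrete is to fingerprint each record when it is captured and verified, then recompute the fingerprint at later pipeline stages. The Python sketch below is a minimal illustration, not a prescribed method: the record fields, the fingerprint and verify_unchanged helpers, and the choice of SHA-256 over canonical JSON are all assumptions made for the example.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a SHA-256 digest of a record serialised in a canonical form.

    Sorting keys and fixing separators means the same content always
    yields the same digest, regardless of dict insertion order.
    """
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_unchanged(record: dict, expected_digest: str) -> bool:
    """Check a record against the digest taken when it was captured."""
    return fingerprint(record) == expected_digest


# Hypothetical record as captured and verified at the source.
captured = {"subject_id": "s-001", "role": "analyst", "years_experience": 7}
digest_at_capture = fingerprint(captured)

# Same content, different key order: still verifies.
reordered = {"years_experience": 7, "subject_id": "s-001", "role": "analyst"}

# A record altered somewhere in the pipeline: no longer verifies.
altered = dict(captured, years_experience=2)

print(verify_unchanged(reordered, digest_at_capture))
print(verify_unchanged(altered, digest_at_capture))
```

A digest of this kind only shows that a record has not changed since capture. It says nothing about whether the record was accurate in the first place, and it is only useful as evidence if the digests are stored somewhere the pipeline itself cannot rewrite.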
What Compliance Actually Requires
Meeting Article 5's requirements for AI systems that process personal data requires capabilities most organisations do not currently have:
Auditability of training data composition. You must be able to demonstrate, for any model trained on personal data, exactly which records were included, when they were included, and whether they were still within their retention period at the time of training.
Tamper-evidence throughout the pipeline. The data at the point of model training must be demonstrably the same data that was captured and verified. A cryptographic audit trail from source to training run can provide that evidence; a log file in a mutable system cannot, because the log itself can be altered without detection.
Response capability for data subject requests. When an individual exercises their right to rectification or erasure, you must be able to identify all AI systems trained on their data and assess the impact. Without a complete, integrity-verified record of training data composition, this is practically impossible.
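The three capabilities above can be sketched together as a hash-chained training manifest. The Python example below is an illustration under stated assumptions rather than a reference implementation: ManifestEntry, TrainingManifest, the field names, and the run and subject identifiers are all hypothetical, and a production system would also need durable storage and an externally anchored chain head.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from datetime import date

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


@dataclass(frozen=True)
class ManifestEntry:
    run_id: str         # hypothetical training-run identifier
    subject_id: str     # pseudonymous data-subject identifier
    record_digest: str  # digest of the record as verified at capture
    included_on: str    # ISO date the record entered the training set
    retain_until: str   # ISO date the retention period ends
    prev_hash: str      # hash of the previous entry (the chain link)
    entry_hash: str     # hash over all of the fields above


def _hash_fields(fields: dict) -> str:
    payload = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TrainingManifest:
    """Append-only record of which personal data went into which run."""

    def __init__(self) -> None:
        self.entries = []

    def append(self, run_id, subject_id, record_digest, included_on, retain_until):
        fields = {
            "run_id": run_id,
            "subject_id": subject_id,
            "record_digest": record_digest,
            "included_on": included_on.isoformat(),
            "retain_until": retain_until.isoformat(),
            "prev_hash": self.entries[-1].entry_hash if self.entries else GENESIS,
        }
        entry = ManifestEntry(**fields, entry_hash=_hash_fields(fields))
        self.entries.append(entry)
        return entry

    def verify_chain(self) -> bool:
        """Tamper-evidence: recompute every hash and check every link."""
        prev = GENESIS
        for entry in self.entries:
            fields = asdict(entry)
            stored_hash = fields.pop("entry_hash")
            if entry.prev_hash != prev or _hash_fields(fields) != stored_hash:
                return False
            prev = stored_hash
        return True

    def runs_for_subject(self, subject_id):
        """Data subject requests: which runs used this person's data?"""
        return sorted({e.run_id for e in self.entries if e.subject_id == subject_id})

    def out_of_retention(self):
        """Auditability: entries included after their retention period ended."""
        return [
            e for e in self.entries
            if date.fromisoformat(e.included_on) > date.fromisoformat(e.retain_until)
        ]


manifest = TrainingManifest()
manifest.append("run-1", "s-001", "ab" * 32, date(2024, 1, 10), date(2025, 1, 1))
manifest.append("run-1", "s-002", "cd" * 32, date(2024, 1, 10), date(2023, 12, 31))
manifest.append("run-2", "s-001", "ab" * 32, date(2024, 6, 1), date(2025, 1, 1))

# Simulate someone editing a retention date after the fact.
tampered = TrainingManifest()
tampered.entries = list(manifest.entries)
tampered.entries[1] = replace(tampered.entries[1], retain_until="2030-01-01")
```

In this sketch, manifest.verify_chain() succeeds while tampered.verify_chain() fails, because the edited entry no longer matches its stored hash. The chain only detects edits to individual entries. Someone with full write access could rebuild every hash from scratch, which is why the latest entry hash would need to be recorded somewhere outside the system being audited. Identifying the affected runs is also only the first step: assessing the impact on each trained model remains a separate exercise.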
ROOTKey's GDPR-aligned data integrity infrastructure provides the technical foundation for all three requirements. See how facilitated compliance audits work in practice, and start a free compliance assessment.





