Processing personal data in AI models: an Opinion from the EDPB

Last year, the European Data Protection Board (EDPB) adopted an Opinion examining key data protection issues arising from the use of AI models, with a particular focus on how personal data is processed during the development and deployment phases. The Opinion establishes that AI models trained on personal data cannot automatically be deemed anonymous. Instead, each case must be rigorously assessed based on the risk of extracting personal data, directly or indirectly, from the model. This assessment should cover both deliberate extraction (through targeted attacks such as membership inference or model inversion) and inadvertent leakage via user queries.
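To make one of these extraction risks concrete, the sketch below illustrates a loss-threshold membership inference attack, a common baseline from the research literature rather than anything specified in the Opinion: it guesses that an example was part of the training data when the model's loss on it falls below a calibrated threshold. All function names and loss values here are hypothetical.

```python
# Minimal loss-threshold membership inference sketch (hypothetical example).
# Intuition: models tend to have lower loss on examples they were trained on,
# so an unusually low per-example loss is weak evidence of membership.

def calibrate_threshold(member_losses, nonmember_losses):
    """Pick a threshold halfway between the two groups' mean losses."""
    mean_member = sum(member_losses) / len(member_losses)
    mean_nonmember = sum(nonmember_losses) / len(nonmember_losses)
    return (mean_member + mean_nonmember) / 2

def infer_membership(losses, threshold):
    """Guess 'member' (True) when the model's loss is below the threshold."""
    return [loss < threshold for loss in losses]

# Hypothetical per-example losses from some trained model:
member_losses = [0.05, 0.10, 0.08]     # examples known to be in the training set
nonmember_losses = [0.90, 1.20, 0.75]  # held-out examples
threshold = calibrate_threshold(member_losses, nonmember_losses)
guesses = infer_membership([0.07, 1.1], threshold)  # → [True, False]
```

Real attacks are more sophisticated (per-example calibration, shadow models), but even this toy version shows why the Opinion treats low extraction risk as something to be demonstrated, not assumed.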

The document outlines a structured approach for supervisory authorities (SAs) to evaluate claims of anonymity. SAs are urged to consider technical measures employed during model design, such as data minimization strategies, privacy-preserving techniques (e.g., differential privacy), and robust testing against state-of-the-art extraction methods. Documentation plays a crucial role; controllers must provide detailed records covering data selection, preparation, training methodologies, and the security controls implemented at every phase.
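As an illustration of the privacy-preserving techniques the Opinion points to, the Laplace mechanism is the textbook building block of differential privacy: an aggregate statistic is released with calibrated noise so that no individual's presence in the data can be reliably inferred from the output. This is a generic sketch, not a technique prescribed by the EDPB; the parameter names follow standard differential-privacy terminology.

```python
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Release true_value plus Laplace(sensitivity/epsilon) noise.

    Gives epsilon-differential privacy for a query whose output changes
    by at most `sensitivity` when one person's record is added or removed.
    """
    scale = sensitivity / epsilon
    # A Laplace(scale) sample is the difference of two iid exponentials.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_value + noise

# Hypothetical use: publish a noisy count of records in a training set.
# A counting query has sensitivity 1 (one person changes the count by at
# most 1); a smaller epsilon means more noise and stronger privacy.
noisy_count = laplace_mechanism(true_value=1000, sensitivity=1, epsilon=0.5)
```

In practice, model training would use mechanisms adapted to gradients (e.g. DP-SGD), but the accounting idea is the same: a quantified privacy budget that a controller can document for a supervisory authority.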

In addition, the Opinion addresses the use of legitimate interest as a legal basis for processing personal data in AI development and deployment. It reinforces the necessity of a three-step test for legitimate interest:

(1) clearly articulating a lawful, precise, and tangible interest;

(2) demonstrating that the data processing is necessary for the intended purpose; and

(3) ensuring that the interest is not overridden by the rights and freedoms of the data subjects.

The assessment must also factor in the data minimization and transparency requirements of the GDPR.

Key recommendations include:

• Implementing robust anonymization and privacy-preserving measures during the development phase.

• Maintaining thorough and accessible documentation to demonstrate compliance with the GDPR.

• Conducting regular testing and audits against emerging extraction techniques and re-identification risks.

• Clearly communicating data processing purposes and measures to data subjects, ensuring transparency and accountability throughout the AI model lifecycle.

These recommendations aim to promote responsible innovation while safeguarding data protection rights in an increasingly complex AI landscape.