I just finished a very interesting discussion with Mark Thompson (Director of Research and Insight, IAPP) and William Malcolm (Director, Privacy Legal, Google) on future trends in Privacy Enhancing Technologies (PETs). During the conference, we discussed the intricacies of Differential Privacy, Federated Learning, Homomorphic Encryption, Synthetic Data, Trusted Enclaves, the Privacy Sandbox, and more. But you may be asking: what exactly are PETs?
PETs are technologies built to support privacy and data protection by embodying data protection principles such as data minimisation, pseudonymisation, and anonymisation. Now that this is clear, let’s take a closer look at the definitions of some PETs.
Federated Learning: ENISA defines federated learning as “a set of training techniques that trains a model on several decentralised servers containing local data samples, without exchanging their data samples. This avoids the need to transfer the data and/or entrust it to an untrusted third party and thus helps to preserve the privacy of the data.” Learn more about how federated learning can help reduce data breaches in ENISA’s December 2021 publication entitled “Securing machine learning algorithms”.
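To make the idea concrete, here is a minimal sketch of federated averaging (FedAvg) in plain Python. The scenario, clients, and model are invented for illustration: two simulated clients each hold private samples of a simple linear relationship, train locally, and send only model parameters to a server, which averages them. The raw samples never leave the clients.

```python
import random

random.seed(0)

# Two simulated clients, each holding private samples from y = 2x + 1 + noise.
def make_client(n=50):
    xs = [random.uniform(-1, 1) for _ in range(n)]
    ys = [2 * x + 1 + random.gauss(0, 0.1) for x in xs]
    return xs, ys

clients = [make_client(), make_client()]

w, b = 0.0, 0.0  # global model parameters (slope, intercept)
for _ in range(100):  # communication rounds
    updates = []
    for xs, ys in clients:
        lw, lb = w, b
        for _ in range(5):  # local gradient steps; raw data stays on the client
            n = len(xs)
            gw = sum((lw * x + lb - y) * x for x, y in zip(xs, ys)) / n
            gb = sum((lw * x + lb - y) for x, y in zip(xs, ys)) / n
            lw -= 0.1 * gw
            lb -= 0.1 * gb
        updates.append((lw, lb))
    # The server only ever sees parameters, never samples (FedAvg step).
    w = sum(u[0] for u in updates) / len(updates)
    b = sum(u[1] for u in updates) / len(updates)
```

After training, the global model recovers the shared relationship (roughly w ≈ 2, b ≈ 1) even though the server never touched a single data point.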
Homomorphic Encryption: is defined by ENISA as “a building block for many privacy enhancing technologies like secure multi-party computation, private data aggregation, pseudonymisation or federated machine learning to name a few. Homomorphic encryption allows computations on encrypted data to be performed, without having to decrypt them first. The typical use case for homomorphic encryption is when a data subject wants to outsource the processing of her personal data without revealing the personal data in plaintext. It is apparent that such functionalities are very well suited when processing is performed by a third party such as a cloud service provider.” Read more about Homomorphic Encryption in ENISA’s January 2022 report “Data Protection Engineering from Theory to Practice”.
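The “computing on encrypted data” property is easiest to see in code. Below is a toy implementation of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The tiny primes are for illustration only; a real deployment would use keys of around 2048 bits and a vetted library.

```python
import math
import random

# Toy Paillier keys (tiny primes, illustration only -- NOT secure).
p, q = 17, 19
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = math.lcm(p - 1, q - 1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # modular inverse, part of the private key

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Additive homomorphism: a third party can add the plaintexts
# by multiplying the ciphertexts, without ever decrypting them.
a, b = 12, 30
c_sum = (encrypt(a) * encrypt(b)) % n2
assert decrypt(c_sum) == a + b  # 42
```

This is the outsourcing use case from the ENISA definition in miniature: the party holding only ciphertexts performs the computation, and only the key holder can read the result.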
Synthetic Data: is just what its name says – data that has been generated synthetically! This means that a dataset is used to create artificial, new data with statistical properties that are similar to the original dataset. As the EDPS explains, “Keeping the statistical properties means that anyone analysing the synthetic data, a data analyst for example, should be able to draw the same statistical conclusions from the analysis of a given dataset of synthetic data as he/she would if given the real (original) data.” Read more about the positive and potentially negative impacts of synthetic data on the EDPS website.
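A minimal sketch of that idea: fit a simple statistical model to an original dataset, then sample brand-new records from the model. The “customer ages” dataset and the single-variable normal model are invented assumptions for illustration; real synthetic data generators model far richer structure.

```python
import random
import statistics

random.seed(1)

# Stand-in "original" dataset: customer ages (illustrative, not real data).
original = [random.gauss(40, 8) for _ in range(1000)]

# Fit a simple parametric model to the original data...
mu = statistics.mean(original)
sigma = statistics.stdev(original)

# ...then draw entirely new, artificial records from that model.
synthetic = [random.gauss(mu, sigma) for _ in range(1000)]

# An analyst should reach the same statistical conclusions from either set.
print(statistics.mean(original), statistics.mean(synthetic))
```

No synthetic record corresponds to any real individual, yet aggregate analyses (means, spreads) come out nearly the same, which is exactly the property the EDPS definition describes.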
During our discussion, we agreed that Privacy by Innovation is a business requirement that complements legislative regulation of data and, more generally, of the digital space. I also stressed that any discussion or project around Privacy by Innovation should not only take the legal and technical domains into consideration, but should also reflect an organization’s commitment to process data in a socially responsible way: Data Protection as CSR! See the Framework here.
Many thanks to BSI for having us!