On May 25th, 2018, a new privacy law took effect
in Europe: the GDPR, or General Data Protection Regulation.
It gives EU citizens control over who is allowed to collect their personal data
and over what happens with it. It’s the reason why you are bombarded with
these popups asking your permission to gather and process your personal data. It’s also why email newsletters are asking
you to confirm if you’re still interested in them and why companies are suddenly making
it easier to grab a copy of the data that they have on you. Companies from all around the world are working
quickly to make sure they comply with the new regulations because otherwise, they’ll
face hefty fines. But what about Blockchain technology? If you’ve been following my channel, you
probably came across the video in which I explain what a blockchain is and how it works. If you watched it, you know that data on a
blockchain is recorded in an open and transparent way. What’s more: data stored on a Blockchain
cannot be changed or erased. These properties are what allow a blockchain
to be completely distributed without a central authority. But at the same time, these properties don’t
sound so good when it comes to privacy. So that got me thinking: does the GDPR kill
blockchains? Well, let’s first explore some terminology
surrounding the GDPR. The EU calls companies that store your data
“data controllers”, and those that work with your data, for instance to analyze it,
are called “data processors”. In most cases, the data controller is also
the data processor, but they could also be different companies. It is the data controller that is responsible
for complying with the GDPR. We also have to realize that the GDPR is only
applicable when we’re talking about personal data of EU citizens. Any company storing any personal information
of EU citizens should follow the regulation. Even foreign companies like Facebook or Apple
have to adhere to the law because they have European users. But, hold on, what qualifies as personal data? Well, the law states that it means “any
information relating to an identified or identifiable natural person”. And that is actually quite broad! Let me explain… Personal data sounds quite straightforward. You might think of your name, age or gender
as personal information. And that’s correct. But what about your phone number? Or credit card number? Or your computer’s IP address? Are these personal data? These are just random numbers that we can’t
link directly to a person. But an internet service provider can see which
customer is using which IP address. The same thing applies to your Bitcoin wallet
address. It’s a random string of letters and numbers
that can’t be linked directly to a person. It can, however, be indirectly linked to you
if you bought some Bitcoin using your credit card or through an exchange. So in short: even random numbers and letters
can qualify as personal data if they can be linked to a specific person. So now that we know what the terms “data
controller”, “data processor” and “personal data” mean, let’s take a look at how
the GDPR conflicts with Blockchains. There are three articles in the GDPR that
are problematic: Article 16, which is about the right to rectification,
Article 17, about the right to be forgotten, and Article 18, about the restriction of processing. Let’s start with Article 16, which gives
you the right to correct the data that someone has about you. Not only can you change existing data that
they have on you but you can also add new data if you feel that the current data is
inaccurate or incomplete. Adding new data to a Blockchain is not a problem,
but changing data is! The same applies to Article 17, the right
to be forgotten. Not being able to remove data from a blockchain
means you can’t exercise your right to delete your data. That means Blockchains can’t comply
with the GDPR, and therefore they can’t store personal data of EU citizens. And finally, we have Article 18, which gives
you the right to prevent companies from doing something with your data. You can do this when the data is inaccurate
or if it was unlawfully collected. The problem here is that most blockchains
are completely open, allowing anyone to grab a copy of all the data and do anything
they want with it. So you have no control over who is processing
your data. So how can a blockchain get around these issues? Well, let’s take a look. The first possible solution would be to encrypt
personal data before storing it on a blockchain. And this is where the law gets a bit hazy. Using strong encryption means that only the
person or company with the decryption key can actually do something with that data. To delete the data, all you have to do is
destroy the key and, in theory, the encrypted data becomes useless. At least that is how they view things in the
UK. Others argue that strong encryption is still
reversible. As our computers get faster over time, it becomes
more likely that the encryption can be broken, revealing the personal data again. Maybe not such a good solution after all. A better solution would be to store the personal
data in a permissioned blockchain instead of a public one. What’s the difference? Well, public blockchains allow anyone to see
the data that’s stored inside of them and to put new data on the chain. Permissioned blockchains are very different:
access is controlled and restricted to only a few known and trusted parties. By doing this we can comply with article 18
of the GDPR: the right to restrict who can process your data. But a permissioned blockchain is still immutable,
meaning we can’t edit or delete data and thus we can’t comply with article 16 and
17. Also not a real solution to this problem… A real solution would be to simply store the
personal data somewhere else, somewhere where we have read and write access. Let’s say a secure server. Then we can store a reference to that data
on our blockchain. Almost like a shortcut or pointer. To create this link we make a digital fingerprint
of our data using a hash function. And then we store that hash on the blockchain. Why use a hash? Well, because it has two interesting properties. First of all: hashes only work one way, meaning
you can create the hash of some data but you can’t take the hash and turn it back into
that data. And secondly, a hash function allows us to
verify that the files on the central server haven’t been tampered with. An important property if we’re moving to
this model! The hash stored inside the blockchain is just
a string of random letters and numbers but it qualifies as personal data because it can
be linked to the data on the server. If we now want to exercise our right to be
forgotten, we just remove the actual data from the central server. The hash in our blockchain now becomes useless
and is no longer considered “personal data“ because it points towards nothing. However, this solution isn’t perfect because
blockchains are decentralized and with a system like this you would partially centralize it
again. Finally, we have solution 4: zero-knowledge
proofs. This is a technology that allows you to prove
that something is true, without revealing the actual data. In the case of a cryptocurrency: you can prove
that a transaction happened without disclosing how much money you transferred or to whom. This technology is used by Zcash to let users
completely hide their transactions. Let’s imagine a simple example: before entering
a bar you have to prove that you’re over 21 years old. You could do that by showing your identity
card with your date of birth, but then you’d reveal more information than necessary. With a zero-knowledge proof, you can deliver
the proof that you’re over 21, without giving away your actual age or your date of birth. That way people only have to reveal the absolute
minimum amount of data about themselves. Interesting stuff! Leave a comment below if I should make a video
about zero-knowledge proofs in the future! So these are some solutions that would make
a blockchain compatible with the GDPR. But let’s assume that we don’t care and
we store personal data in a blockchain anyway. Then we have an interesting legal issue. The law states that the company that stores
personal data, the data controller, is responsible for obeying the law. But who is the data controller in a blockchain? Who should be held responsible? Is it everyone participating in the network? Is it the people who create and verify blocks? Or maybe the people who develop the protocols
and write the code? You can’t blame everyone participating in
the network because they have no control over what others store on the blockchain. Then we can hold the people who make blocks
responsible, right? Well, not really, because they might not know
that the data they get is personal. And that leaves us with the people who develop
the blockchain protocols. We can’t hold them responsible either because
they only develop a tool. Punishing them would be like closing hammer
factories because their tools can be used to commit violent acts. So clearly we have some work to do. The GDPR is a great way to protect the privacy
of EU citizens but it leaves us with some questions as to how it can be applied to Blockchains. We’ve seen a few ways in which blockchains
can be adapted to conform to the law, but some of these aren’t desirable, while others, like the central server or zero-knowledge
proofs, could offer a solution. So that was it for this video. Remember to give it a thumbs up and get subscribed
if you want to see more videos like this one. Thank you very much for watching and I’ll
see you in the next video!
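
For anyone who wants to see the third solution from this video in code (personal data on a server, only a hash on the blockchain), here is a minimal Python sketch. It is an illustration under assumptions, not a real implementation: the dictionary `off_chain_store`, the list `chain` and all the function names are hypothetical stand-ins that don’t come from any blockchain library, and SHA-256 is just one common choice of hash function.

```python
import hashlib

# Hypothetical stand-ins (assumptions for this sketch only):
# a dict plays the role of the secure server, a list plays the append-only chain.
off_chain_store = {}
chain = []


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def store_record(record_id: str, data: bytes) -> str:
    """Keep the personal data off-chain and append only its hash to the chain."""
    off_chain_store[record_id] = data
    digest = fingerprint(data)
    chain.append({"record_id": record_id, "hash": digest})
    return digest


def verify_record(record_id: str) -> bool:
    """Check that the off-chain data still matches the hash on the chain."""
    data = off_chain_store.get(record_id)
    if data is None:
        return False  # data was deleted, so the on-chain hash points at nothing
    on_chain = next(
        (entry["hash"] for entry in chain if entry["record_id"] == record_id),
        None,
    )
    return on_chain is not None and fingerprint(data) == on_chain


def forget_record(record_id: str) -> None:
    """Right to be forgotten: delete the off-chain data; the chain is untouched."""
    off_chain_store.pop(record_id, None)
```

Calling `store_record("user-1", b"name=Alice")` puts the data on the "server" and the hash on the "chain", `verify_record("user-1")` detects tampering on the server, and `forget_record("user-1")` removes the data while the chain entry stays in place. One caveat: a plain hash of short, guessable data can be found by simply trying candidates, so real systems would likely mix in a random secret value (a salt) that is kept off-chain as well.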