Crypto Slo cryptocurrency news and
welcome to the first lecture in our
series on Bitcoin and crypto currencies
I want to start by introducing the four
lecturers who are going to be speaking
in the series the first lecturer is
Joseph be no he’s a postdoctoral
researcher in computer science at
Princeton University
the second lecturer is me Edie Felton
I’m a professor at Princeton in computer
science and in the Woodrow Wilson School
the third lecture is Arvind Narayan and
he’s a computer science professor at
Princeton and fourth our special guest
lecturers Andrew Miller he’s a PhD
student in computer science at the
University of Maryland there will be 11
lectures in total in this lecture number
one we’re going to do two things first
we’ll introduce some cryptographic
primitives that turn out to be necessary
for talking about crypto currencies in
particular we’ll talk about
cryptographic hashes and digital
signatures and we’ll talk about some of
the ways in which those are used to
build crypto currencies and then at the
end of the lecture we’ll start talking
about crypto currencies and I’ll give
some examples of simple crypto
currencies that illustrate some of the
design challenges that we need to deal
with I want to apologize for covering
the cryptographic material at the
beginning unfortunately we have to eat
some of our vegetables a little bit in
order to let to lay groundwork for the
crypto currency stuff so if you came for
the crypto currency stuff let me assure
you first of all that we will get to it
in this lecture and that having laid the
groundwork in this lecture there’s going
to be a lot more specifically crypto
currency focused material in later
lectures all right so let’s get to it
in segment 1.1 we’re going to talk about
cryptographic hash functions we’ll talk
about what they are and what their
properties are and then later we’ll move
on and talk about what their
applications are so a cryptographic hash
function is a mathematical function and
it has three attributes that we need to
start with first of all a hash function
can take any string as input absolutely
any string of any size it produces a
fixed size output will use 256 bits in
this series of lectures because that’s
what Bitcoin does and it has to be
efficiently computable meaning given a
string it in a reasonable length of time
you can figure out what the output is so
that’s a hash function but we’re going
to need hash functions that are
cryptographically secure the the
cryptographic properties of hash
functions are a complicated topic in
general but we’re going to focus here on
three particular properties and I’ll
explain in a minute what those are in
particular that the function is
collision free that it has a hiding
property and that it’s puzzle friendly
and for each of these I’ll talk about
what the property is what it means and
then I’ll talk about why it’s useful to
have a function that has that property
so first collision-free so the first
property that we need from a
cryptographic hash function is that it’s
collision free and what that means is
that it’s impossible nobody can find
values x and y such that x and y are
different and yet the hash of x is equal
to the hash of y and so if we look at
the operation of the function as
depicted by one of these red arrows
here’s X and H of X and here’s Y and H
of Y that nobody can find a situation
like this that you have an x and y that
are separate and yet when you hash them
they hash to the same value now one
thing to notice is that I said nobody
can find I didn’t say that there is no
collision because if you think about it
there has to be a collision collisions
do exist and to understand why that is
we can we can use this diagram over here
on the Left I’m depicting all of the
possible inputs to this function which
can be a string of any size and over
here I have all of the possible outputs
which has to be a string of 256 bits in
size so the right-hand side here the
outputs there are only two to the 256
possibilities over here there are more
so if you think that every point over
here on the left is going to be mapped
by an arrow to some point on the right
you can see that as you go from all the
points over here on the left into the
right it has to get crowded and in fact
that there have to be multiple values
over here on the left that mapped to the
same output over here in fact in general
there will be a very large number of
possible inputs that map to any
particular output so collisions do exist
I said before nobody can find a
collision and that’s the key question
we know collisions exist the question is
are there any collisions that are
findable by regular people using regular
computers okay now to make things even
worse I said that it has to be
impossible to find a collision let me
tell you how to find a collision because
there’s a method that’s guaranteed to
work and the method works like this that
we’re going to pick two to the 130
randomly chosen inputs over on the left
cloud of that previous diagram and if we
pick those two to the 130 randomly
chosen inputs it turns out there’s a
ninety-nine point eight percent chance
that at least two of them are going to
collide and so this is a very simple
method for finding a collision it works
no matter what the hash function is but
of course the problem is that this takes
a very very long time to do you have to
compute the hash function to to the one
hundred and thirty times and and that’s
of course an astronomical number this
method works no matter which hash
function we’re using there’s still a
ninety-nine point eight percent
probability that this works and if it
doesn’t work just try it again it’ll
probably work the next time but this
doesn’t really matter and the reason it
doesn’t really matter is that this
procedure takes two to the 130 steps in
order to get to that high probability so
we can say something like this we can
say that if every computer every mate
ever made by humanity was computing
since the in beginning of the entire
universe up to now the odds that they
would have found a collision is still
infinitesimally small so small that it’s
way less than the odds that the earth
will be destroyed by a giant meteor in
the next two seconds which didn’t happen
okay so we know how to find a collision
but this method takes too long to matter
the question is is there some other
method that could be used on a
particular hash function in order to
find a collision and that’s the question
that is harder to to answer is there a
faster way to find collisions well for
some possible values of hash functions
of course there are for example if our
hash function were to simply take the
input modulo two to the 256 that is it
just selected the last 256 bits of the
input then we would know an easy way to
find a collision one collision would be
the values 3 + 3 + 2 to the 256 so for
some possible values of the hash
function it’s very easy to find a
collision for others we don’t know now
one thing I need to note is that there’s
no hash function in existence which has
been proven to be collision free there
are just some that people have tried
really really hard to find collisions
and haven’t succeeded and so we choose
to believe that those are collision free
ok now what good does collision freedom
do us if we can assume that we have a
hash function that is collision free
then we can use that hash function as a
message digest and what I mean by that
is the following that if we know that x
and y have the same hash then it’s safe
to assume that x and y are different
because if someone knew an x and y that
were different that had the same hash of
course that would be a collision since
there’s not a collision that we know of
then knowing the hashes are the same we
can assume that the values are the same
and this lets us use the hash as a kind
of message digest suppose for example
that we had a file a really big file and
we wanted to be able to recognize later
whether another file was the same as the
file we saw the first time right so one
way to do that would be to save the
whole big file and then when we saw
another file later just compare them but
because we have hashes that we believe
are collision free it’s more efficient
to just remember the hash of the
original file then if someone shows us a
new file and claims that it’s the same
we can compute the hash of that new file
and compare the hashes if the hashes are
the same then we conclude that the files
must have been the same and that gives
us a very efficient way to remember
things we’ve seen before and recognize
them again and of course this is useful
because the hash is small it’s only 2
256 bits while the original file might
be really big so hash is useful as a
digest and we’ll see later on in this
lecture and in subsequent lectures why
it’s useful to use hashes of message
digests so the second property that we
want from our hash function is that it’s
hiding and the property that we want is
something like this that if we’re given
the output of the hash function that
there’s no feasible way to figure out
what the input X was the problem is that
this property doesn’t exactly hold and
to understand why that’s the case let’s
look at this example so here what we’re
going to do is an experiment where we
flip a coin and if the result of the
coin flip was heads we’re going to
return the hash of the string heads and
if the result was tails we’re going to
return the hash of the string tails and
now we’re going to ask someone who
didn’t see the coin flip but only saw
this hash output to figure out what the
string was that was hashed that of
course is going to be easy it’s easy in
this scenario to find what the input
string was it’s easy to find X you
simply compute the hash of the string
heads and the hash of the string tails
and you see which one you got and so in
just a couple of steps you can figure
out what X was so the reason this
example failed that is the reason why an
adversary was able to guess what the
string was was that there were only a
couple of possible values of X and so if
we’re going to have a property like this
it needs to be the case that there’s no
value of X which is particularly likely
that is all the X has to be chosen from
a set that’s in some sense very spread
out so this method for the adversary of
just trying all the possible values of X
or just trying a few values of X that
are especially likely is not going to
work so the hiding property that we are
going to need to set up is a little bit
more complicated and the way we’re going
to fix this problem with the common
value X like heads and tails is we’re
going to take the X and we’re going to
put next to it we’re going to
concatenate with it with it a value R
which is chosen from a distribution
that’s really spread out and so this H
of R concatenated with X that means take
all the bits of R and put after them all
the bits of X and so what we’re going to
say is given the hash of our two
with ex that it’s infeasible to find X
and that this will be true in the
formally stated property that if R is a
random value chosen from a distribution
that has high min entropy then given the
hash of our concatenated with X it’s
infeasible to find X and what is time me
an entropy mean well it captures this
intuitive idea that R is chosen from a
distribution that’s really spread out
and what that means specifically is that
there’s no particular value that r could
have had that would occur with more than
a negligible probability so for example
if R is chosen uniformly from among all
of the strings that are 256 bits long
then any particular string was chosen
with probability 1 in 2 to the 256 which
is truly a negligible value so as long
as R was chosen that way then the hash
of our concatenated with X is going to
hide X and that’s the hiding property
that the hash function will be deemed to
have okay now let’s look at an
application of that hiding property and
in particular what we want to do is
something called a commitment and this
is kind of the digital analogy of taking
a value a number and sealing it in an
envelope and putting that envelope out
on the table where everyone can see it
now when you do that you’ve committed to
what’s in the envelope but you haven’t
opened it it’s secret from everyone else
later you can open the envelope and get
out the value but it’s sealed so commit
to a value and reveal it later we want
to do that in a digital sense so to be
more specific about what is the API that
we’re going to provide here the
commitment API looks like this that
there are two things you can do first
you can commit to a message and that’s
going to return two values a commitment
and a key think of the commitment as the
envelope that you’re going to put on the
table and the key as a secret key for
unlocking the envelope then later you
allow someone else to verify given a
commitment and given a key which you’ve
told them in the meantime and the
message they can verify that that
commitment key and message really do go
together and this will return true or
okay now to seal MSG in an envelope what
we do is we commit to the message and
that returns a commitment and a key and
then we publish the commitment that’s
putting the envelope on the table now
later to open the envelope what we’re
going to do is Pablo
the key and the message that we
committed to and then anybody can use
this verify call with the commitment
that we published previously the key and
message that we just announced to check
the validity of of our opening the
envelope okay and the property of course
we want from this is that it behaves
like sealing an envelope and in
particularly the two security properties
are these first give and calm the
commitment the envelope on the table
that someone just looking at the
envelope can’t figure out what the
message is the second property is that
it’s binding that when you commit to
what’s in the envelope you can’t change
your mind later that is it’s infeasible
to find two different messages such that
you can commit to one message and then
later claim that you committed to
another and the whole thing will verify
okay so how do we know that these two
properties hold well first we need to
talk about how we’re going to actually
implement commitments and the way we’re
going to implement commitments is like
this that in order to commit to a value
message we’re going to generate a random
256 value bit value and call it the key
and then we’re going to as the
commitment return the hash of the key
concatenated together with the message
and as and as the key value we’re going
to return h of this of the of this key
and then later to verify someone is
going to compute this same hash of the
key they were given concatenated with
the message and they’re going to check
whether that’s equal to the commitment
that they saw okay so this is a way of
using hash functions both in the
commitment and in the verification so
now the security properties if we go
down to the security properties that
were at the bottom of the previous slide
and we just plug in the definitions of
how we’re going to implement this here
that is this used to say comm given comm
infeasible defined message we just plug
in what comm is comm is that hash of key
concatenated with message and similarly
down here this is what happens when we
take what was written there before and
plug in the definition of verify income
okay so now what these properties become
the first one is given H of key
concatenated with message it’s
infeasible to find message well it turns
out that that’s exactly the hiding
property that we talked about before key
was chosen as a 2 random 256 bit value
and therefore the the hiding property
says that if we take the message and we
put before it’s something that was
chosen from a very spread out
distribution like I said a random 256
bit value then it’s infeasible to find
the message so this is exactly the
hiding property and this one down here
turns out to be exactly the the
collision-free property so that if
someone can find two messages which have
the same hash like this well then they
have an input value here and an input
value there that are different and yet
does have the same hash and so because
of the two security properties we’ve
talked about for hashes so far this
commitment scheme will work in the sense
that it will have the necessary security
properties okay so that’s the second
security property of hashes that they’re
hiding and the application of that is
commitments the third security property
we’re going to need is that their puzzle
friendly and this is again a little bit
more complicated but let me just go
through it a bit by bit that for any
possible output value Y that you might
want from the hash function we’re going
to use Y as an output value of the hash
later that if K is chosen from a
distribution that has high min entropy
that is K is chosen randomly from some
set that’s super spread out then there’s
no way to find an X such that the hash
of K and X is equal to Y so what this
means is basically that if someone wants
to target the hash function if they want
it to come out to some particular output
value Y that if there’s part of the
input that is chosen in a suitably
randomized way that it’s very difficult
to find another value that hits exactly
that target so the application we’re
going to use of this is we’re going to
build a search puzzle and what that
means is we’re going to build a
mathematical problem which requires
searching a very large space in order to
find the solution and where there’s no
shortcuts a way to find a good solution
other than searching that large space
that’s that’s a search puzzle to be more
specific the idea is that if we’re given
a puzzle id which is chosen from some
high min entropy distribution that is
some very spread out probability
distribution and we’re given a target
set Y which someone try
wants to make the hash function fall
into then we want to try to find a
solution X so that if we hash the puzzle
ID together with the solution X we get a
result that’s in the set Y so the idea
is y is a is a target range or set of
hash results that we want ID specifies a
particular puzzle and X is a solution to
the puzzle and the puzzle friendly
property here implies that there’s no
solving strategy for this puzzle which
is much better than just trying random
values of X and so if we want to if we
want to pose a puzzle that’s difficult
to solve that we can do it this way as
long as we can generate puzzle IDs in a
suitably random way and we’re going to
use that later when we talk about
Bitcoin mining that’s a sort of
computational puzzle we’re going to use
okay so we’ve talked about three
properties of hash functions and one
application of each of those now let me
talk just very briefly about the
particular hash function we’re going to
use there are lots of hash functions in
existence but this is the one Bitcoin
uses and it’s a pretty good one to use
it’s called sha 256 or sha 256 and it
works like this
basically it takes the message that
you’re hashing and it breaks it up into
blocks that are 512 bits in size the
message isn’t going to be in general
necessarily exactly a multiple of the
block size so we’re going to add some
padding at the end and the padding is
going to consist of at the end of the
padding a 64 bit length field which is
the length of the message in bits and
then before that it’s going to consist
of a 1 bit followed by some number of 0
bits and you choose the number of 0 bits
so that this comes out exactly 2 to the
end of a block so once you’ve padded the
message so that its length is exactly a
multiple of the 512 bit block size you
then chop it up into blocks and you then
execute this computation you start with
the 256 bit value called the IV that’s
just a number that you look up in a
standards document and then you take the
IV and the first block of the message
you take those 768 total bits and you
run them through this special function C
and the compression function and out
comes 256 bits you now take that with
the next 512 bits of the message run it
through C again and you keep going each
iteration of C crunches
another 512 bit block of the message and
mixes it in sort of logically to the to
the to the result and when you get to
the very end you have consumed all of
the blocks of the message plus the
padding the result is the hash that’s a
256 bit value and it’s easy to show that
if this function see this compression
function is collision free then this
entire hash function will also be
collision free the other properties are
a little bit more complicated so I won’t
talk about them here okay so we’ve
talked about hash functions we’ve talked
about what hash functions do we’ve
talked about three properties of hash
functions and applications of those
properties and the specific hash
function that we use in Bitcoin in the
next lecture segment we’ll talk about
ways of using hash functions to build
more complicated data structures that
are used in distributed systems like
Bitcoin in section 1.2 we’re going to
talk about hash pointers and their
application a hash pointer is a kind of
data structure that turns out to be used
a lot in the systems that we’re talking
about and a hash pointer is basically a
simple thing that we’re going to take a
pointer to where some information is
stored and we’re going to together with
that pointer store a cryptographic hash
of the information so whereas a regular
pointer gives you a way to retrieve the
information a hash pointer is going to
let us ask to get the information back
and it’s also going to let us verify
that the information hasn’t changed so a
hash pointer tells us where something is
and what its value was okay and we’re
going to draw a hash pointer in diagrams
like this then we’re going to have we’re
gonna have H of and then an arrow that
points to something
so anything drawn like this they think
of it as being a hash pointer to do this
this thing it’s a pointer to where it’s
stored and it’s also the hash of the
value that this data had when we last
saw it and we can take hash pointers and
we can use them to build all kinds of
data structures so a key idea here take
any data structure a linked list a
binary search tree or something like
that and implement it with hash pointers
instead of pointers as we normally would
for example here’s a linked list that
we’ve built with hash pointers and this
is a data structure that we’re going to
call a blockchain
so just like a regular linked list where
you have a series of blocks and each
block has data as well as a pointer to
the previous block in the list here the
previous block pointer will be replaced
with a hash pointer so it says where it
is and what the value of this entire
previous block was and we’re going to
store we’re going to remember the head
of the list like this just as a regular
hash pointer okay
and a use case for this for a blockchain
like this is a tamper evident log that
is if we want to build a log data
structure that stores a bunch of data so
that we can add data on to the end of
the log but if somebody goes later and
messes with data that is earlier in the
log we’re going to detect it that’s what
the tamper evidence means so to
understand why a blockchain gives us
this tamper-evident property let’s ask
what happens if an adversary wants to go
back and tamper with data later that’s
in the middle of the chain okay so let’s
assume that an adversary wants to tamper
with this block down here he wants to
change the data here and he wants to do
it in such a way that we the holders of
the of the hash pointer at the head here
won’t be able to detect it all right so
the adversary changed the contents of
this block and therefore the hash here
which is a hash of this entire block is
not going to match up because the hash
function is collision free it must be
the case that the hash of this block is
now different and so we could detect the
inconsistency between this data and the
hash pointer that we remembered before
or we could do that in unless the
adversary also tampers with the hash
pointer if he tampers with this hash
pointer then he makes these to match up
but now he’s changed the content of this
block and what that means is that when
we come back later and hash the contents
of this block it’s not going to match
the hash that we remembered before
because the contents of the block has
changed and so we’re going to detect the
inconsistency between this the contents
of this block and this hash unless the
adversary also tampers with the block
over here on the right but now when he
does that the hash of this block is not
going to match the hash that we
remembered up here and the hash that
we’re holding on to and this the
adversary can’t
tamper with because this is the value
that we remembered as being the head of
the list and so the upshot of this is
that if the adversary wants to tamper
with data anywhere in this entire chain
in order to keep the story consistent
he’s going to have to tamper with the
hash pointers all the way back to the
beginning and he’s ultimately going to
run into a roadblock because he won’t be
able to tamper with the head of the list
and so what this means is that just by
remembering this hash pointer we’ve
essentially remembered a kind of hash a
tamper-evident hash of the entire list
all the way back to the beginning and so
we can build a block chain like this
containing as many blocks as we want
going back to some special block at the
beginning of the list which we might
call the Genesis block and that’s a
tamper evident log built out of a
blockchain now another useful data
structure that we can build using hash
pointers is a binary tree we can build a
binary tree with hash pointers and this
is called in the jargon a Merkle tree
after ralph merkle who invented it and
the idea is this suppose we have a bunch
of data blocks which we’ll draw across
the bottom down here we’re going to take
pairs consecutive pairs of these data
blocks and for these two data blocks
we’re going to build a data structure
here that has two hash pointers one to
each of these blocks and similarly all
the way across will then go another
level up and this this block here will
will contain a hash pointer of these two
children down here and so on all the way
back up to the root of the tree and then
just like before we’re going to remember
just the hash pointer up here at the
head of the tree and we can then if we
want traverse down through the hash
pointers to any point in the list and we
can make sure that the data hasn’t been
tampered with because just like I showed
you with the with the blockchain if an
adversary tampers with some block down
here at the bottom with the data that
will change that will cause the hash
pointer that’s one level up to not match
so we’ll have to tamper with that and
therefore he’ll have to tamper with the
hash pointer one level up from there and
therefore he’ll have to tamper with the
hash pointer one level up from there and
eventually he’ll get up to the top where
he won’t be able to tamper with the hash
pointer that we’ve remembered so again
any attempt to tamper with any piece of
data across the bottom will be insured
against by just remembering the hash
pointer at the top now another nice
feature about
Merkle trees is that unlike the
blockchain that we built before that if
we want to if someone wants to prove to
us that a particular data block is a
member of this Merkle tree all they need
to show us is is this amount of data so
if we remember just the route and
someone wants to convince us that this
block is in the Merkel tree they need to
show us this block and we can verify
that the hash matches up and then they
need to show us this block and we can
verify that the hash of this matches
that they can show us this block we
verify that the hash of this block
matches this hash pointer and then they
show us the data and just by just by
verifying the hashes up to the root we
can make we can ensure we can verify
that this data block was in the Merkel
tree so that takes about log an items
that we need to be shown and it takes
about login time for us to verify it and
so with the very large number of blocks
and data blocks in the Merkel tree we
can still verify prove membership in a
relatively short time
okay so Merkel trees have various
advantages one advantage of course is
the tree holds many items but we just
need to remember the one root hash which
is only 256 bits we can verify
membership in a Merkel tree in
logarithmic time and logarithmic space
that’s nice and there’s a variant which
is a sorted Merkel tree that’s just a
Merkel tree where we take the blocks at
the bottom and we sort them into some
order say alphabetical lexicographic
order or numerical order or some order
that we agree on now having done that
once we’ve sorted the Merkel tree now
it’s possible to verify non membership
in a Merkel tree that is we can prove
that a particular block is not in the
Merkel tree and the way we do that is
simply by showing a path to the item
that’s just before where that up where
that item would be and just after where
it would be and then we can say look
both of these items are in the Merkel
tree they’re consecutive and therefore
there’s no space in between them there’s
nothing in between them and so the thing
that we’re trying to prove
of can’t be there all right so Merkel
tree is a binary search tree built with
hash pointers we can do logarithmic time
membership proofs non-membership proofs
if we sort the tree and it’s very
more generally it turns out that we can
use hash pointers in any pointer based
data structure as long as the data
structure doesn’t have cycles if there
are cycles in the data structure then we
won’t be able to make all the hashes
match up if you think about it in an a
cyclic data structure we can sort of
start near the leaves or near the the
things that don’t have any pointers
coming out of them compute the hashes of
those and then work our way back sort of
toward the beginning but in the
structure with cycles there’s no end
that we can start with and compute back
from so for example a directed acyclic
graph out of hash pointers and we’ll be
able to verify membership in that day
very efficiently and it will be easy to
compute so this is a general trick that
you’ll see over and over throughout the
distributed data structures and
throughout the algorithms that we talk
about later in this lecture and in
subsequent lectures
in segment 1.3 we’re going to talk about
digital signatures this is the second
cryptographic primitive along with hash
functions that we need as building
blocks for the cryptocurrency discussion
later on so a digital signature is
supposed to be just like a signature on
paper only in digital form and what that
means is this what we want from
signatures is two things
first that just like an idealized paper
signature only you can make your
but anyone who sees your signature can
verify that it’s valid and then the
second thing you want is that the
signature is tied to a particular
document so that somebody can’t take
your signature and snip it off one
document and glue it onto the bottom of
another one because the signature is not
just a signature it signifies your
agreement or endorsement of a particular
document okay so the question is how can
we build this in a digital form using
cryptography so let’s get into the nuts
and bolts here’s an API for digital
signatures there are three things three
operations that we need to be able to do
the first one is we need in the
beginning to be able to generate keys
and so we have a generate keys operation
and we tell it a key size how big in
bits should the keys be and this
produces two keys SK and piqué SK will
be a secret signing key this is
information you keep secret that you use
for making your signature and PK is a
public verification key that you’re
going to give to everybody and that
anybody can use to verify your signature
when they see it okay the second
operation is the sign operation so the
sign operation you take your secret
signing key and you take some message
that you want to put your signature on
and it returns sig which is a signature
it’s just some string of bits that
represents your signature and then the
third operation is a verify that takes
something that claims to be a valid
signature and verifies that it’s correct
it takes the public key of the signer it
takes the message that the signature is
supposedly on and it takes the supposed
signature and it just says yes or no is
this a valid signature okay so these
three operations these three algorithm
constitute a signature scheme and I’ll
note that the first two can be
randomized algorithms the verification
won’t be it will always be deterministic
and in fact if you think about it
generate keys had better be randomized
because it ought to be generating
different keys for different people okay
so the requirements for the signatures
at a slightly more technical level are
the following two requirements first of
all that if a signature is that valid
signatures will verify if a signature is
valid that is if I sign a message with
SK with the seek with my secret key that
if someone then later tries to validate
that using my public key and the same
message that that will that that will
validate correctly so this says that the
Signet that signatures are useful at all
but then the second thing you want is
that it’s impossible to forge signatures
that is an adversary who knows your
public key who knows your verification
key and gets to see signatures on some
other messages can’t forge your
signature on some message that he wants
to forge it on and and in order to
explain this property in a little bit
more detail it’s normally formulated in
terms of a sort of game that we play
with an adversary so the game I’ll
depict it here with this diagram so over
here on the left you have the Challenger
who’s a TV judge and the Challenger is
is going to test a claim by an attacker
the attacker claims that he can forge
signatures and we’re going to test that
claim and the judge will pass judgment
on it the attacker here this guy is
actually with dipping his one of the
inventors of digital signatures of the
concept of digital signatures and a
distinguished cryptographers so I
thought I’d let him play the attacker
role here okay so the game works like
this the first thing we do is we use
generate keys to generate a secret key a
secret signing key and a public
verification key that match up now we
give the secret key to the Challenger to
the judge and we give the public key to
both parties both to the Challenger and
to the attacker so the attacker only
knows information that’s public he only
knows the public key and his mission is
going to be to try to forge a message
the Challenger knows the secret key so
he can make signatures right now if you
think about a real-life application
and a real-life attacker would be able
to see valid signatures from their
would-be victim on a number of different
documents and maybe the attacker could
even manipulate the victim into signing
innocuous-looking documents if that’s
useful to the attacker
so in our game we’re going to allow the
attacker to get signatures on some
documents of his choice and we see that
in the diagram like this the attacker is
going to send over a message m0 to the
Challenger and the Challenger is going
to sign that message and send the
signature back the attacker can look at
that scratch its head a little bit and
send over another message m1 the
Challenger will sign that and we do that
for as long as the attacker wants the
attacker can send over any sequence of
messages he wants and get signatures on
once the attacker is satisfied that he’s
seen enough signatures and we’re gonna
let him see only a plausible number then
he’s going to pick some message m that
he wants to forge a signature on and
he’s going to try to forge a signature
and of course there’s a rule that says
that this M this message that he’s
trying to forge a signature on isn’t one
of the ones that messages that he’s
already seen because it would be really
easy for him to Ford to send over a
valid signature on m0 I mean we sent him
a valid signature on m0 earlier so he’s
going to pick some other message that he
hasn’t seen a signature for already and
he’s going to send over what he claims
is a signature on that message and then
the question is can he succeed so the
Challenger is going to run the verify
algorithm use the public verification
key and on that message and the
signature that the attacker provided and
is going to check whether it verifies
and if it does verify if this returns
true then the attacker wins the attacker
has forged a message and so this game is
what we use to define what it means for
a digital signature scheme to have the
unforgeable ‘ti property and if we want
to get really precise what we say is
that the attackers probability of
winning this game is negligible and that
that’s true no matter what algorithm the
attacker is using in other words we’re
going to say that the signature scheme
is unforgeable if no matter what
algorithm the attack
is using the attacker has only a
negligible chance of successfully
forging a message and if we have that
property together with the much easier
property that valid messages verify then
we have a digital signature scheme that
is suitable okay now there’s a bunch of
practical things that we need to do to
turn that algorithmic idea into a more
practically implementable signature
mechanism for example the algorithms we
talked about our randomized at least
some of them will be and so we need a
good source of randomness and this the
importance of this really can’t be
underestimated banned randomness will
sink you your algorithm will be insecure
and I’ll just point out here that
attacks on the source of randomness are
a favorite trick of intelligence
agencies and those are the people who
know what kinds of attacks are likely to
be successful in practice there’s a
limit on the message size that you’re
able to sign because real schemes are
going to operate on bit strings of
limited length the fix to that is simply
to use the hash of the message rather
than the message itself that way the
message can be really big
but the hash will be only 256 bits and
because hash functions are collision
free it’s safe to use the hash of the
message as the input to the digital
signature scheme rather than the message
and by the way a fun trick which will
see used later is that you can sign a
hash pointer and if you sign a hash
pointer then the signature covers or
protects the whole structure not just
the hash pointer itself but everything
it points to and everything it points to
for example if you were to sign the hash
pointer that was at the end of a
blockchain the result is that you would
effectively be digitally signing the
entire contents of that blockchain
that’s a useful trick that will see used
later okay now let’s get into the nuts
and bolts Bitcoin uses a particular
digital signature scheme that’s called
ECDSA that’s the elliptic curve digital
signature algorithm and it’s a US
government standard and we won’t go into
all the details of how ECDSA works it
relies on some extremely hairy math and
trust me you don’t want to see all the
details of how that works you can look
it up if you’re interested so we’ll skip
that one thing I’ll note though is with
ECDSA good randomness I said this before
but I’ll say it again
because it’s really essential good
randomness is especially essential with
ECDSA if you use bad randomness in
generating keys or even in signing you
probably leaked your private key it
stands to reason that if you use bad
randomness in generating a key that the
key that you generate is maybe not
secure but it’s a quirk of ECDSA that if
you use even if you use bad randomness
just in making a signature using your
perfectly good key that also will leak
your private key and then it’s game over
so we need to be especially careful
about this in practice this is a common
mistake so that completes the discussion
of digital signatures as a cryptographic
primitive and then the next segment
we’ll move on and talk about some
applications of digital signatures that
will turn out to be useful in building
crypto currencies
in segment one point four will move on
having covered digital signatures and
talk about a nice trick that we can use
that goes along with digital signatures
and the useful trick is this the idea is
to take a public key of one of those
public verification keys from a digital
signature scheme and equate that to an
identity that is an identity of a person
or an actor in a system so if you see a
signature that verifies correctly
that is if you see a signature such that
you can verify with someone’s public key
that that that is a signature on a
particular message then you can think of
that as that public key saying the
message you can literally think of a
public key as kind of like an actor or
an inn or a a party in a system and that
they can make statements by signing
those statements and so if you think in
that mindset then this public key is
like an identity it’s an actor who can
do stuff in the system and if you think
about it then for someone to speak for
pk that is for someone to be able to
make statements so that will be seen as
coming out of P K’s mouth you have to
know the matching secret key SK and so
if you know the secret key that
corresponds to the public key PK then
you can sign messages with that secret
key and what you’re doing essentially is
making statements on behalf of that
public key and that means that there is
an identity in the system which only you
can speak for and of course that’s what
you want an identity to be something
that one person can speak for or on
behalf of that everybody can can see all
right so if we’re going to treat public
keys as identities one of the
consequences of that is that you can
make a new identity whenever you want if
you want to make a new identity you just
do this you create a new random key pair
SK and PK by doing the generate Keys
operation in our digital signature
scheme and we get out SK and PK PK is
then the public name that you can use
that’s the name of that identity what
it’s called although in practice you’d
probably use the hash of PK because
public keys are big but again that will
leave that aside is it
so PK or the hash of it is the public
name that you use to talk about the
identity and SK the secret key is the
information that lets you the person who
generated this identity speak for the
identity you control the identity
because only you know the secret key and
if you generated this in such a way that
the public key PK looks random then
nobody needs to know who you are
you can generate a fresh identity that
looks random that looks that that looks
like a face in the crowd that only you
can control this brings us to the idea
of decentralized identity management
then rather than having a central place
that you have to go in order to register
as a user in a system you don’t need to
get a username you don’t need to inform
someone that you’re going to be using a
particular name if you want a new
identity just make one anybody can make
a new identity at any time and you can
make as many as you want if you prefer
to be known by five different names no
problem just make five identities if you
want to be somewhat anonymous for a
while you can make a new identity use it
just for a little while then throw it
away all of these things are possible
with decentralized identity management
and there’s no central point of control
so that you don’t have to have anyone
who’s in charge of it the system
operates in an entirely decentralized
way and this is the way Bitcoin in fact
does identity these identities are
called addresses in Bitcoin jargon and
so you hear the term address used in
talking about Bitcoin and crypto
currencies but what that really is is
just a public key or a hash of a public
key it’s an identity that someone made
up out of thin air as part of this
decentralized identity management scheme
now the obvious question that arises
when you’re talking about decentralized
identity management and people making up
these identities is how private is this
and the answer is it’s complicated on
the one hand the addresses that are made
up this way are not connected to your
real-world identity you can execute a
randomized algorithm it will make some
kind of PK that looks kind of random and
nothing exists initially to connect that
to who you are you can do that in the
privacy of your own home so that’s good
news if you want to be able to act
privately but
the bad news if you want to act
privately is that if that address if
that identity is making a series of
statements over time if it’s engaging in
a series of Acts over time that people
can see that whoever this is has done a
certain series of actions and they can
start to connect the dots gee this
person is acting a lot like Joe maybe
this person is Joe and so an observer
can link together these things over time
and make inferences and so this balance
between on the one hand there being no
initial tie to real-world identity and
on the other hand that a pattern of
behavior of an address emerging over
time and the question of what can be
inferred in which dots can be connected
that gets pretty complicated and that’s
really the question of privacy in a
crypto currency like Bitcoin and there’s
a whole lecture about that later on and
so I’m not going to steal the thunder of
that lecture I just want to give you an
idea of how decentralized identity makes
privacy a complicated question
in segment 1.5 we’re going to move from
talking about cryptography and we’re
going to move on to crypto currencies
now I know that many of you showed up
here for the crypto currency stuff and
and trust me there will be a lot more
crypto currency material in future
unfortunately we needed to eat some of
our cryptographic vegetables in order to
have the background to talk about crypto
currencies and now you’ll see when I
start talking about crypto currencies
how these pieces fit together and why
the cryptographic operations why cash
functions and digital signatures are
actually useful alright so in this
section I want to talk about some
simplified crypto currencies that give
us ideas about how systems like Bitcoin
work of course it’s going to require
about ten more lectures in order to
really spill out all of the implications
of how Bitcoin works and and what that
means but but let me talk about some
very simple crypto currencies to get the
discussion started and first let’s talk
about goofy coin goofy coin is about the
simplest cryptocurrency we can imagine
and it works kind of like this there are
just a couple rules of goofy coin the
first rule is that goofy can create new
coins goofy can make a new coin whenever
he wants and when he makes a new coin it
belongs to him so when goofy makes a
coin it’s represented by a data
structure like this here you have the
create coin operation and there’s a
unique coin ID that goofy generated and
then there’s a digital signature that
was put on it by goofy which anyone can
verify so anyone being given this can
verify that the signature is valid and
that it’s a signature of this statement
and new coins belong to goofy by
definition because those are the rules
that goofy made so that’s the first rule
goofy can create new coins the second
rule of goofy coin is that whoever owns
a coin can pass it on to someone else
they can spend it so for example here we
have the coin that I showed you before
that goofy created and now we’re going
to take a hash pointer to that coin and
and then we’re going to create a
statement Goofy’s going to make a
statement that says pay this to Alice
Alice is being named by a public key
here pay to public key Alice
the coin that’s represented by this hash
pointer and this in is also signed by
goofy now goofy is the one who owned
that coin and so goofy has to sign any
any transaction that spends the coin and
once this has happened now Alice owns
the coin Alice owns the coin and Alice
can prove that she owns the coin because
she can present this data structure here
which is validly signed by goofy and
points to a coin that was validly owned
by goofy and so this the correctness of
this coin is self-evident in the system
now Alice can move on and she can spend
the coin as well so here we have the
coin we had before this is down here at
the bottom we have the creation of the
coin signed by goofy now goofy paid the
coin to alice via this hash pointer and
he signed that now alice is the owner of
the coin now she can create a statement
like this that says pay this coin to
Bob’s public key and here’s a hash
pointer to the coin and now Alice signs
that son because Alice was the valid
owner of the coin which we could verify
by walking this chain now we know that
this is valid and the coin belongs to
Bob so Bob is now the owner of this coin
so those are all the rules of goofy coin
goofy can create new coins by simply
signing a statement that he’s making a
new coin with a unique coin ID and then
whoever owns a coin can pass it on to
someone else by signing a statement
saying pass on this coin to person X and
you can verify the validity of a coin by
simply following the chain and verifying
all of the signatures along the way
that’s goofy coin all right now there’s
a problem though there’s a big security
problem with goofy coin and we can see
it in this structure here so look at
this coin here this is the coin that
goofy made and then paid to Alice Alice
was the owner of that coin and there’s a
Alice paid this coin on to Bob but now
Alice makes another data structure like
this which pays this pays to chuck the
very same coin and this is signed by
Alice now if truck doesn’t know about
this thing up on the upper left this
data structure let’s say Alice just gave
that to Bob and didn’t tell Chuck now
Chuck will look at this and he’ll think
that this is perfectly valid and now
he’s the owner of the coin Chuck
has a valid looking claimed to be the
owner of this coin and Bob has an
equally valid looking claim to be an
owner of this coin and that’s a problem
because coins are not supposed to work
that way this is called a double
spending attack it’s called double
spending because alice is spending the
same coin twice and double spending
attacks are one of the key problems that
a cryptocurrency has to solve goofy coin
does not solve the double spending
attack and therefore goofy coin is not
so although goofy coin is simple and we
understand its rules
it won’t cut it as a cryptocurrency
because it allows double spending so in
order to build a cryptocurrency that is
going to be workable we need to have
some solution to the double spending
problem and indeed the double spending
problem is the main design challenge
that we face in designing a
cryptocurrency so we need to somehow
improve on goofy coin and we’ll do that
by designing another coin which I’ll
call Scrooge coin Scrooge coin is going
to be rather like goofy coin except it
will solve the double spending problem
in a particular way and this coin was
created by Scrooge okay so this is a
little bit more complicated in terms of
data structures but here’s one of the
key ideas that Scrooge is going to
publish a history of all the
transactions that have happened this
will be a blockchain that data structure
we talked about before and it will be
digitally signed by Scrooge so anyone
can and it looks like this of course
it’s a series of blocks data blocks each
block will have one transaction in it
this block has the transaction with
transaction ID number 73 and it has the
contents of this transaction and then
there’s a hash pointer to the previous
block in the history okay and then and
then Scrooge will take the hash pointer
which represents this entire structure
and he’ll digitally sign it and publish
it now anybody can verify that Scrooge
really did sign this hash pointer and
then they can follow this chain all the
way back and see what is the entire
history of all the transactions in the
history of Scrooge coin as endorsed by
Scrooge okay now I said here that we put
one transaction in each block we do that
for simplicity of explanation but in
practice as an optimization we’d really
put multiple transactions into the same
block as Bitcoin does so you can bear in
mind as I talked about
Scrooge coin that that’s the way we’d
really do it in practice so Scrooge
publishes this history what does the
history do well the thing the history
does for us is it allows us to detect
double spending because assume Alice
owns a coin and she’s going to pay that
coin on to Bob and she’s then later
going to try to pay that coin on to
Charlie Charlie’s going to notice that
something is wrong because Charlie will
be able to look into the history and see
that Alice already paid that coin to Bob
in fact everyone will be able to see
that Alice already paid that coin to Bob
so if she tries to pay that coin to
chuck then everyone can see that that’s
a double spend and they’ll be able to
reject it Scrooge will be ejected and
everyone else will reject it and know
that they really shouldn’t trust Alice
all right so in Scrooge coin there are
two kinds of transactions the first kind
is a create coins transaction and what
it does is create new coins that’s like
the operation goofy could do in goofy
coin that makes a new coin but here
we’re going to allow multiple coins to
be created in one transaction so here’s
what a create coins transaction looks
like it has transaction ID number 73
let’s say in this case it’s transaction
type is create coins and then down here
there’s a list of which coins are
created each coin is going to have a
serial number within this transaction 0
1 2 etc each coin has a value it’s worth
a certain number of Scrooge coins and
each coin has a recipient which is going
to be a public key who gets that coin as
its created so this transaction types
creates a bunch of new coins and assigns
them to people as initial owners now
we’re going to have a concept in Scrooge
coin of a coin id that refers to a
particular coin so this particular coin
here is coin id 73 / n 0 because it was
created in transaction 73 and it was
number 0 within that transaction
similarly we have 73 / N 1 73 / n 2 and
so on so every coin in Scrooged coin has
a coin id that we can use to refer to it
a create coins transaction is always
valid why is it valid well because
Scrooge said so and they call it Scrooge
coin for a reason
if scrooge puts this into the history
which he signs then it’s valid by
we don’t need to worry about whether
Scrooge is entitled to create coins just
like we didn’t need to worry in goofy
coin about whether goofy is entitled to
create coins the rules of the system
which were created by Scrooge simply say
that if Scrooge wants to make coins then
that’s valid so anything he puts into
the history is valid the second kind of
transaction we’re going to talk about is
app a coins transaction and this is a
transaction that consumes some coins and
destroys them and creates new coins of
the same total value but which might
belong to different people so over here
on the Left we have an example of what a
pay coinsurance action looks like this
is transaction ID number 73 let’s say
it’s type is pay coinsurance earned and
destroyed by this pay coins transaction
so we’re going to add up the value of
all of those coins and then we’re going
to create a bunch of new coins down here
0 1 & 2 etc just like before in the
create coins transaction each one has a
value each one will belong to a certain
recipient and those new coins had better
add up to the same total value as the
coins that we consume and then at the
bottom we have a set of digital
signatures this transaction has to be
signed by everyone who’s paying in a
coin so if you’re the owner of one of
the coins that’s going to be consumed in
this transaction
then you need to digitally sign the
transaction to say that you’re really
okay with spending this coin the rules
of Scrooge coins say that app a coins
transaction is valid if four things are
true first if the consumed coins are
valid that is they really were created
in previous transactions second that the
consumed coins were not already consumed
in some previous transaction that is
that this is not a double spend third
that the total value of the coins that
come out of this transaction is equal to
the total value of the coins that went
in and finally that the transaction is
validly signed by the owners of all of
the consumed coins if all of those
things are true then this pay coins
transaction is valid Scrooge will accept
it he’ll write it into the history into
the blockchain and everyone will see
that this transaction has happened one
thing to note about this scheme is that
coins are immutable
coins are never changed they’re never
subdivided they’re never combined all
they are is created once in one
transaction and then later consumed in
some other transaction but you can get
the same effect as being able to
subdivide or pay on or combine coins by
using transactions for example if you
want to subdivide a coin you can just
create a new transaction that consumes
that one coin and then produces two new
coins of the same total value and if you
want you can give those two new coins
back to yourself that’s a way that you
can subdivide a coin that you own
similarly you can combine coins or you
can pay on a coin in effect by just
creating a chain of transactions each of
which pass that value on in the form of
a new coin to someone else
so although coins are immutable in this
system it has all of the flexibility of
a system that didn’t have immutable
coins okay now we come to the core
problem with Scrooge coin Scrooge coin
will work people can see which coins are
valid it prevents double spending
because everyone can look into the into
the blockchain and see that all of the
transactions are valid and that every
coin is consumed only once but the
problem is Scrooge Scrooge thinks this
is fine right Scrooge says don’t worry
I’m honest but the fact is if Scrooge
starts misbehaving then we’re going to
have a problem or if Scrooge just gets
bored of the whole Scrooge coin scheme
and stops doing the things that he’s
supposed to do then the system won’t
operate anymore and so the problem we
have here is centralization that
although Scrooge is happy with this
system we as users of it might not be so
the central technical challenge that we
need to solve in order to improve on
Scrooge coin is can we discourage áfiveá
system that is can we get rid of that
centralised Scrooge figure can we have a
cryptocurrency that operates like
Scrooge coin in many ways but doesn’t
have any central trusted authority in
order to do that we’re going to need to
figure out how to provide the services
that Scrooge provides but do it in a
decentralized way and a way in which no
particular party is particularly trusted
that means we’re going to need to figure
out how every
can agree upon a single published
blockchain that is the agreed-upon
history of which transactions have
happened we need to figure out how
people can agree which transactions are
valid and in which transactions have
actually occurred and we need to figure
out how we can assign IDs to things in a
decentralized way if we can solve all of
those problems then we can build a
currency that is very much like Bitcoin
which is like Scrooge coin but without a
centralized party but in order to do
that it’s going to take a few more
lectures and we hope you’ll stick around
and watch them thanks