# Introduction to Crypto and CryptoCurrencies – Crypto Academy Lecture 1

Crypto Slo cryptocurrency news and

investing

welcome to the first lecture in our

series on Bitcoin and crypto currencies

I want to start by introducing the four

lecturers who are going to be speaking

in the series the first lecturer is

Joseph be no he’s a postdoctoral

researcher in computer science at

Princeton University

the second lecturer is me Edie Felton

I’m a professor at Princeton in computer

science and in the Woodrow Wilson School

the third lecture is Arvind Narayan and

he’s a computer science professor at

Princeton and fourth our special guest

lecturers Andrew Miller he’s a PhD

student in computer science at the

University of Maryland there will be 11

lectures in total in this lecture number

one we’re going to do two things first

we’ll introduce some cryptographic

primitives that turn out to be necessary

for talking about crypto currencies in

particular we’ll talk about

cryptographic hashes and digital

signatures and we’ll talk about some of

the ways in which those are used to

build crypto currencies and then at the

end of the lecture we’ll start talking

about crypto currencies and I’ll give

some examples of simple crypto

currencies that illustrate some of the

design challenges that we need to deal

with I want to apologize for covering

the cryptographic material at the

beginning unfortunately we have to eat

some of our vegetables a little bit in

order to let to lay groundwork for the

crypto currency stuff so if you came for

the crypto currency stuff let me assure

you first of all that we will get to it

in this lecture and that having laid the

groundwork in this lecture there’s going

to be a lot more specifically crypto

currency focused material in later

lectures all right so let’s get to it

in segment 1.1 we’re going to talk about

cryptographic hash functions we’ll talk

about what they are and what their

properties are and then later we’ll move

on and talk about what their

applications are so a cryptographic hash

function is a mathematical function and

it has three attributes that we need to

start with first of all a hash function

can take any string as input absolutely

any string of any size it produces a

fixed size output will use 256 bits in

this series of lectures because that’s

what Bitcoin does and it has to be

efficiently computable meaning given a

string it in a reasonable length of time

you can figure out what the output is so

that’s a hash function but we’re going

to need hash functions that are

cryptographically secure the the

cryptographic properties of hash

functions are a complicated topic in

general but we’re going to focus here on

three particular properties and I’ll

explain in a minute what those are in

particular that the function is

collision free that it has a hiding

property and that it’s puzzle friendly

and for each of these I’ll talk about

what the property is what it means and

then I’ll talk about why it’s useful to

have a function that has that property

so first collision-free so the first

property that we need from a

cryptographic hash function is that it’s

collision free and what that means is

that it’s impossible nobody can find

values x and y such that x and y are

different and yet the hash of x is equal

to the hash of y and so if we look at

the operation of the function as

depicted by one of these red arrows

here’s X and H of X and here’s Y and H

of Y that nobody can find a situation

like this that you have an x and y that

are separate and yet when you hash them

they hash to the same value now one

thing to notice is that I said nobody

can find I didn’t say that there is no

collision because if you think about it

there has to be a collision collisions

do exist and to understand why that is

we can we can use this diagram over here

on the Left I’m depicting all of the

possible inputs to this function which

can be a string of any size and over

here I have all of the possible outputs

which has to be a string of 256 bits in

size so the right-hand side here the

outputs there are only two to the 256

possibilities over here there are more

possibilities

so if you think that every point over

here on the left is going to be mapped

by an arrow to some point on the right

you can see that as you go from all the

points over here on the left into the

right it has to get crowded and in fact

that there have to be multiple values

over here on the left that mapped to the

same output over here in fact in general

there will be a very large number of

possible inputs that map to any

particular output so collisions do exist

I said before nobody can find a

collision and that’s the key question

we know collisions exist the question is

are there any collisions that are

findable by regular people using regular

computers okay now to make things even

worse I said that it has to be

impossible to find a collision let me

tell you how to find a collision because

there’s a method that’s guaranteed to

work and the method works like this that

we’re going to pick two to the 130

randomly chosen inputs over on the left

cloud of that previous diagram and if we

pick those two to the 130 randomly

chosen inputs it turns out there’s a

ninety-nine point eight percent chance

that at least two of them are going to

collide and so this is a very simple

method for finding a collision it works

no matter what the hash function is but

of course the problem is that this takes

a very very long time to do you have to

compute the hash function to to the one

hundred and thirty times and and that’s

of course an astronomical number this

method works no matter which hash

function we’re using there’s still a

ninety-nine point eight percent

probability that this works and if it

doesn’t work just try it again it’ll

probably work the next time but this

doesn’t really matter and the reason it

doesn’t really matter is that this

procedure takes two to the 130 steps in

order to get to that high probability so

we can say something like this we can

say that if every computer every mate

ever made by humanity was computing

since the in beginning of the entire

universe up to now the odds that they

would have found a collision is still

infinitesimally small so small that it’s

way less than the odds that the earth

will be destroyed by a giant meteor in

the next two seconds which didn’t happen

okay so we know how to find a collision

but this method takes too long to matter

the question is is there some other

method that could be used on a

particular hash function in order to

find a collision and that’s the question

that is harder to to answer is there a

faster way to find collisions well for

some possible values of hash functions

of course there are for example if our

hash function were to simply take the

input modulo two to the 256 that is it

just selected the last 256 bits of the

input then we would know an easy way to

find a collision one collision would be

the values 3 + 3 + 2 to the 256 so for

some possible values of the hash

function it’s very easy to find a

collision for others we don’t know now

one thing I need to note is that there’s

no hash function in existence which has

been proven to be collision free there

are just some that people have tried

really really hard to find collisions

and haven’t succeeded and so we choose

to believe that those are collision free

ok now what good does collision freedom

do us if we can assume that we have a

hash function that is collision free

then we can use that hash function as a

message digest and what I mean by that

is the following that if we know that x

and y have the same hash then it’s safe

to assume that x and y are different

because if someone knew an x and y that

were different that had the same hash of

course that would be a collision since

there’s not a collision that we know of

then knowing the hashes are the same we

can assume that the values are the same

and this lets us use the hash as a kind

of message digest suppose for example

that we had a file a really big file and

we wanted to be able to recognize later

whether another file was the same as the

file we saw the first time right so one

way to do that would be to save the

whole big file and then when we saw

another file later just compare them but

because we have hashes that we believe

are collision free it’s more efficient

to just remember the hash of the

original file then if someone shows us a

new file and claims that it’s the same

we can compute the hash of that new file

and compare the hashes if the hashes are

the same then we conclude that the files

must have been the same and that gives

us a very efficient way to remember

things we’ve seen before and recognize

them again and of course this is useful

because the hash is small it’s only 2

256 bits while the original file might

be really big so hash is useful as a

mess

digest and we’ll see later on in this

lecture and in subsequent lectures why

it’s useful to use hashes of message

digests so the second property that we

want from our hash function is that it’s

hiding and the property that we want is

something like this that if we’re given

the output of the hash function that

there’s no feasible way to figure out

what the input X was the problem is that

this property doesn’t exactly hold and

to understand why that’s the case let’s

look at this example so here what we’re

going to do is an experiment where we

flip a coin and if the result of the

coin flip was heads we’re going to

return the hash of the string heads and

if the result was tails we’re going to

return the hash of the string tails and

now we’re going to ask someone who

didn’t see the coin flip but only saw

this hash output to figure out what the

string was that was hashed that of

course is going to be easy it’s easy in

this scenario to find what the input

string was it’s easy to find X you

simply compute the hash of the string

heads and the hash of the string tails

and you see which one you got and so in

just a couple of steps you can figure

out what X was so the reason this

example failed that is the reason why an

adversary was able to guess what the

string was was that there were only a

couple of possible values of X and so if

we’re going to have a property like this

it needs to be the case that there’s no

value of X which is particularly likely

that is all the X has to be chosen from

a set that’s in some sense very spread

out so this method for the adversary of

just trying all the possible values of X

or just trying a few values of X that

are especially likely is not going to

work so the hiding property that we are

going to need to set up is a little bit

more complicated and the way we’re going

to fix this problem with the common

value X like heads and tails is we’re

going to take the X and we’re going to

put next to it we’re going to

concatenate with it with it a value R

which is chosen from a distribution

that’s really spread out and so this H

of R concatenated with X that means take

all the bits of R and put after them all

the bits of X and so what we’re going to

say is given the hash of our two

with ex that it’s infeasible to find X

and that this will be true in the

formally stated property that if R is a

random value chosen from a distribution

that has high min entropy then given the

hash of our concatenated with X it’s

infeasible to find X and what is time me

an entropy mean well it captures this

intuitive idea that R is chosen from a

distribution that’s really spread out

and what that means specifically is that

there’s no particular value that r could

have had that would occur with more than

a negligible probability so for example

if R is chosen uniformly from among all

of the strings that are 256 bits long

then any particular string was chosen

with probability 1 in 2 to the 256 which

is truly a negligible value so as long

as R was chosen that way then the hash

of our concatenated with X is going to

hide X and that’s the hiding property

that the hash function will be deemed to

have okay now let’s look at an

application of that hiding property and

in particular what we want to do is

something called a commitment and this

is kind of the digital analogy of taking

a value a number and sealing it in an

envelope and putting that envelope out

on the table where everyone can see it

now when you do that you’ve committed to

what’s in the envelope but you haven’t

opened it it’s secret from everyone else

later you can open the envelope and get

out the value but it’s sealed so commit

to a value and reveal it later we want

to do that in a digital sense so to be

more specific about what is the API that

we’re going to provide here the

commitment API looks like this that

there are two things you can do first

you can commit to a message and that’s

going to return two values a commitment

and a key think of the commitment as the

envelope that you’re going to put on the

table and the key as a secret key for

unlocking the envelope then later you

allow someone else to verify given a

commitment and given a key which you’ve

told them in the meantime and the

message they can verify that that

commitment key and message really do go

together and this will return true or

false

okay now to seal MSG in an envelope what

we do is we commit to the message and

that returns a commitment and a key and

then we publish the commitment that’s

putting the envelope on the table now

later to open the envelope what we’re

going to do is Pablo

the key and the message that we

committed to and then anybody can use

this verify call with the commitment

that we published previously the key and

message that we just announced to check

the validity of of our opening the

envelope okay and the property of course

we want from this is that it behaves

like sealing an envelope and in

particularly the two security properties

are these first give and calm the

commitment the envelope on the table

that someone just looking at the

envelope can’t figure out what the

message is the second property is that

it’s binding that when you commit to

what’s in the envelope you can’t change

your mind later that is it’s infeasible

to find two different messages such that

you can commit to one message and then

later claim that you committed to

another and the whole thing will verify

okay so how do we know that these two

properties hold well first we need to

talk about how we’re going to actually

implement commitments and the way we’re

going to implement commitments is like

this that in order to commit to a value

message we’re going to generate a random

256 value bit value and call it the key

and then we’re going to as the

commitment return the hash of the key

concatenated together with the message

and as and as the key value we’re going

to return h of this of the of this key

and then later to verify someone is

going to compute this same hash of the

key they were given concatenated with

the message and they’re going to check

whether that’s equal to the commitment

that they saw okay so this is a way of

using hash functions both in the

commitment and in the verification so

now the security properties if we go

down to the security properties that

were at the bottom of the previous slide

and we just plug in the definitions of

how we’re going to implement this here

that is this used to say comm given comm

infeasible defined message we just plug

in what comm is comm is that hash of key

concatenated with message and similarly

down here this is what happens when we

take what was written there before and

plug in the definition of verify income

okay so now what these properties become

the first one is given H of key

concatenated with message it’s

infeasible to find message well it turns

out that that’s exactly the hiding

property that we talked about before key

was chosen as a 2 random 256 bit value

and therefore the the hiding property

says that if we take the message and we

put before it’s something that was

chosen from a very spread out

distribution like I said a random 256

bit value then it’s infeasible to find

the message so this is exactly the

hiding property and this one down here

turns out to be exactly the the

collision-free property so that if

someone can find two messages which have

the same hash like this well then they

have an input value here and an input

value there that are different and yet

does have the same hash and so because

of the two security properties we’ve

talked about for hashes so far this

commitment scheme will work in the sense

that it will have the necessary security

properties okay so that’s the second

security property of hashes that they’re

hiding and the application of that is

commitments the third security property

we’re going to need is that their puzzle

friendly and this is again a little bit

more complicated but let me just go

through it a bit by bit that for any

possible output value Y that you might

want from the hash function we’re going

to use Y as an output value of the hash

later that if K is chosen from a

distribution that has high min entropy

that is K is chosen randomly from some

set that’s super spread out then there’s

no way to find an X such that the hash

of K and X is equal to Y so what this

means is basically that if someone wants

to target the hash function if they want

it to come out to some particular output

value Y that if there’s part of the

input that is chosen in a suitably

randomized way that it’s very difficult

to find another value that hits exactly

that target so the application we’re

going to use of this is we’re going to

build a search puzzle and what that

means is we’re going to build a

mathematical problem which requires

searching a very large space in order to

find the solution and where there’s no

shortcuts a way to find a good solution

other than searching that large space

that’s that’s a search puzzle to be more

specific the idea is that if we’re given

a puzzle id which is chosen from some

high min entropy distribution that is

some very spread out probability

distribution and we’re given a target

set Y which someone try

wants to make the hash function fall

into then we want to try to find a

solution X so that if we hash the puzzle

ID together with the solution X we get a

result that’s in the set Y so the idea

is y is a is a target range or set of

hash results that we want ID specifies a

particular puzzle and X is a solution to

the puzzle and the puzzle friendly

property here implies that there’s no

solving strategy for this puzzle which

is much better than just trying random

values of X and so if we want to if we

want to pose a puzzle that’s difficult

to solve that we can do it this way as

long as we can generate puzzle IDs in a

suitably random way and we’re going to

use that later when we talk about

Bitcoin mining that’s a sort of

computational puzzle we’re going to use

okay so we’ve talked about three

properties of hash functions and one

application of each of those now let me

talk just very briefly about the

particular hash function we’re going to

use there are lots of hash functions in

existence but this is the one Bitcoin

uses and it’s a pretty good one to use

it’s called sha 256 or sha 256 and it

works like this

basically it takes the message that

you’re hashing and it breaks it up into

blocks that are 512 bits in size the

message isn’t going to be in general

necessarily exactly a multiple of the

block size so we’re going to add some

padding at the end and the padding is

going to consist of at the end of the

padding a 64 bit length field which is

the length of the message in bits and

then before that it’s going to consist

of a 1 bit followed by some number of 0

bits and you choose the number of 0 bits

so that this comes out exactly 2 to the

end of a block so once you’ve padded the

message so that its length is exactly a

multiple of the 512 bit block size you

then chop it up into blocks and you then

execute this computation you start with

the 256 bit value called the IV that’s

just a number that you look up in a

standards document and then you take the

IV and the first block of the message

you take those 768 total bits and you

run them through this special function C

and the compression function and out

comes 256 bits you now take that with

the next 512 bits of the message run it

through C again and you keep going each

iteration of C crunches

another 512 bit block of the message and

mixes it in sort of logically to the to

the to the result and when you get to

the very end you have consumed all of

the blocks of the message plus the

padding the result is the hash that’s a

256 bit value and it’s easy to show that

if this function see this compression

function is collision free then this

entire hash function will also be

collision free the other properties are

a little bit more complicated so I won’t

talk about them here okay so we’ve

talked about hash functions we’ve talked

about what hash functions do we’ve

talked about three properties of hash

functions and applications of those

properties and the specific hash

function that we use in Bitcoin in the

next lecture segment we’ll talk about

ways of using hash functions to build

more complicated data structures that

are used in distributed systems like

Bitcoin in section 1.2 we’re going to

talk about hash pointers and their

application a hash pointer is a kind of

data structure that turns out to be used

a lot in the systems that we’re talking

about and a hash pointer is basically a

simple thing that we’re going to take a

pointer to where some information is

stored and we’re going to together with

that pointer store a cryptographic hash

of the information so whereas a regular

pointer gives you a way to retrieve the

information a hash pointer is going to

let us ask to get the information back

and it’s also going to let us verify

that the information hasn’t changed so a

hash pointer tells us where something is

and what its value was okay and we’re

going to draw a hash pointer in diagrams

like this then we’re going to have we’re

gonna have H of and then an arrow that

points to something

so anything drawn like this they think

of it as being a hash pointer to do this

this thing it’s a pointer to where it’s

stored and it’s also the hash of the

value that this data had when we last

saw it and we can take hash pointers and

we can use them to build all kinds of

data structures so a key idea here take

any data structure a linked list a

binary search tree or something like

that and implement it with hash pointers

instead of pointers as we normally would

for example here’s a linked list that

we’ve built with hash pointers and this

is a data structure that we’re going to

call a blockchain

so just like a regular linked list where

you have a series of blocks and each

block has data as well as a pointer to

the previous block in the list here the

previous block pointer will be replaced

with a hash pointer so it says where it

is and what the value of this entire

previous block was and we’re going to

store we’re going to remember the head

of the list like this just as a regular

hash pointer okay

and a use case for this for a blockchain

like this is a tamper evident log that

is if we want to build a log data

structure that stores a bunch of data so

that we can add data on to the end of

the log but if somebody goes later and

messes with data that is earlier in the

log we’re going to detect it that’s what

the tamper evidence means so to

understand why a blockchain gives us

this tamper-evident property let’s ask

what happens if an adversary wants to go

back and tamper with data later that’s

in the middle of the chain okay so let’s

assume that an adversary wants to tamper

with this block down here he wants to

change the data here and he wants to do

it in such a way that we the holders of

the of the hash pointer at the head here

won’t be able to detect it all right so

the adversary changed the contents of

this block and therefore the hash here

which is a hash of this entire block is

not going to match up because the hash

function is collision free it must be

the case that the hash of this block is

now different and so we could detect the

inconsistency between this data and the

hash pointer that we remembered before

or we could do that in unless the

adversary also tampers with the hash

pointer if he tampers with this hash

pointer then he makes these to match up

but now he’s changed the content of this

block and what that means is that when

we come back later and hash the contents

of this block it’s not going to match

the hash that we remembered before

because the contents of the block has

changed and so we’re going to detect the

inconsistency between this the contents

of this block and this hash unless the

adversary also tampers with the block

over here on the right but now when he

does that the hash of this block is not

going to match the hash that we

remembered up here and the hash that

we’re holding on to and this the

adversary can’t

tamper with because this is the value

that we remembered as being the head of

the list and so the upshot of this is

that if the adversary wants to tamper

with data anywhere in this entire chain

in order to keep the story consistent

he’s going to have to tamper with the

hash pointers all the way back to the

beginning and he’s ultimately going to

run into a roadblock because he won’t be

able to tamper with the head of the list

and so what this means is that just by

remembering this hash pointer we’ve

essentially remembered a kind of hash a

tamper-evident hash of the entire list

all the way back to the beginning and so

we can build a block chain like this

containing as many blocks as we want

going back to some special block at the

beginning of the list which we might

call the Genesis block and that’s a

tamper evident log built out of a

blockchain now another useful data

structure that we can build using hash

pointers is a binary tree we can build a

binary tree with hash pointers and this

is called in the jargon a Merkle tree

after ralph merkle who invented it and

the idea is this suppose we have a bunch

of data blocks which we’ll draw across

the bottom down here we’re going to take

pairs consecutive pairs of these data

blocks and for these two data blocks

we’re going to build a data structure

here that has two hash pointers one to

each of these blocks and similarly all

the way across will then go another

level up and this this block here will

will contain a hash pointer of these two

children down here and so on all the way

back up to the root of the tree and then

just like before we’re going to remember

just the hash pointer up here at the

head of the tree and we can then if we

want traverse down through the hash

pointers to any point in the list and we

can make sure that the data hasn’t been

tampered with because just like I showed

you with the with the blockchain if an

adversary tampers with some block down

here at the bottom with the data that

will change that will cause the hash

pointer that’s one level up to not match

so we’ll have to tamper with that and

therefore he’ll have to tamper with the

hash pointer one level up from there and

therefore he’ll have to tamper with the

hash pointer one level up from there and

eventually he’ll get up to the top where

he won’t be able to tamper with the hash

pointer that we’ve remembered so again

any attempt to tamper with any piece of

data across the bottom will be insured

against by just remembering the hash

pointer at the top now another nice

feature about

Merkle trees is that unlike the

blockchain that we built before that if

we want to if someone wants to prove to

us that a particular data block is a

member of this Merkle tree all they need

to show us is is this amount of data so

if we remember just the route and

someone wants to convince us that this

block is in the Merkel tree they need to

show us this block and we can verify

that the hash matches up and then they

need to show us this block and we can

verify that the hash of this matches

that they can show us this block we

verify that the hash of this block

matches this hash pointer and then they

show us the data and just by just by

verifying the hashes up to the root we

can make we can ensure we can verify

that this data block was in the Merkel

tree so that takes about log an items

that we need to be shown and it takes

about login time for us to verify it and

so with the very large number of blocks

and data blocks in the Merkel tree we

can still verify prove membership in a

relatively short time

okay so Merkel trees have various

advantages one advantage of course is

the tree holds many items but we just

need to remember the one root hash which

is only 256 bits we can verify

membership in a Merkel tree in

logarithmic time and logarithmic space

that’s nice and there’s a variant which

is a sorted Merkel tree that’s just a

Merkel tree where we take the blocks at

the bottom and we sort them into some

order say alphabetical lexicographic

order or numerical order or some order

that we agree on now having done that

once we’ve sorted the Merkel tree now

it’s possible to verify non membership

in a Merkel tree that is we can prove

that a particular block is not in the

Merkel tree and the way we do that is

simply by showing a path to the item

that’s just before where that up where

that item would be and just after where

it would be and then we can say look

both of these items are in the Merkel

tree they’re consecutive and therefore

there’s no space in between them there’s

nothing in between them and so the thing

that we’re trying to prove

non-membership

of can’t be there all right so Merkel

tree is a binary search tree built with

hash pointers we can do logarithmic time

membership proofs non-membership proofs

if we sort the tree and it’s very

efficient

more generally it turns out that we can

use hash pointers in any pointer based

data structure as long as the data

structure doesn’t have cycles if there

are cycles in the data structure then we

won’t be able to make all the hashes

match up if you think about it in an a

cyclic data structure we can sort of

start near the leaves or near the the

things that don’t have any pointers

coming out of them compute the hashes of

those and then work our way back sort of

toward the beginning but in the

structure with cycles there’s no end

that we can start with and compute back

from so for example a directed acyclic

graph out of hash pointers and we’ll be

able to verify membership in that day

very efficiently and it will be easy to

compute so this is a general trick that

you’ll see over and over throughout the

distributed data structures and

throughout the algorithms that we talk

about later in this lecture and in

subsequent lectures

in segment 1.3 we’re going to talk about

digital signatures this is the second

cryptographic primitive along with hash

functions that we need as building

blocks for the cryptocurrency discussion

later on so a digital signature is

supposed to be just like a signature on

paper only in digital form and what that

means is this what we want from

signatures is two things

first that just like an idealized paper

signature only you can make your

signature

but anyone who sees your signature can

verify that it’s valid and then the

second thing you want is that the

signature is tied to a particular

document so that somebody can’t take

your signature and snip it off one

document and glue it onto the bottom of

another one because the signature is not

just a signature it signifies your

agreement or endorsement of a particular

document okay so the question is how can

we build this in a digital form using

cryptography so let’s get into the nuts

and bolts here’s an API for digital

signatures there are three things three

operations that we need to be able to do

the first one is we need in the

beginning to be able to generate keys

and so we have a generate keys operation

and we tell it a key size how big in

bits should the keys be and this

produces two keys SK and piqué SK will

be a secret signing key this is

information you keep secret that you use

for making your signature and PK is a

public verification key that you’re

going to give to everybody and that

anybody can use to verify your signature

when they see it okay the second

operation is the sign operation so the

sign operation you take your secret

signing key and you take some message

that you want to put your signature on

and it returns sig which is a signature

it’s just some string of bits that

represents your signature and then the

third operation is a verify that takes

something that claims to be a valid

signature and verifies that it’s correct

it takes the public key of the signer it

takes the message that the signature is

supposedly on and it takes the supposed

signature and it just says yes or no is

this a valid signature okay so these

three operations these three algorithm

constitute a signature scheme and I’ll

note that the first two can be

randomized algorithms the verification

won’t be it will always be deterministic

and in fact if you think about it

generate keys had better be randomized

because it ought to be generating

different keys for different people okay

so the requirements for the signatures

at a slightly more technical level are

the following two requirements first of

all that if a signature is that valid

signatures will verify if a signature is

valid that is if I sign a message with

SK with the seek with my secret key that

if someone then later tries to validate

that using my public key and the same

message that that will that that will

validate correctly so this says that the

Signet that signatures are useful at all

but then the second thing you want is

that it’s impossible to forge signatures

that is an adversary who knows your

public key who knows your verification

key and gets to see signatures on some

other messages can’t forge your

signature on some message that he wants

to forge it on and and in order to

explain this property in a little bit

more detail it’s normally formulated in

terms of a sort of game that we play

with an adversary so the game I’ll

depict it here with this diagram so over

here on the left you have the Challenger

who’s a TV judge and the Challenger is

is going to test a claim by an attacker

the attacker claims that he can forge

signatures and we’re going to test that

claim and the judge will pass judgment

on it the attacker here this guy is

actually with dipping his one of the

inventors of digital signatures of the

concept of digital signatures and a

distinguished cryptographers so I

thought I’d let him play the attacker

role here okay so the game works like

this the first thing we do is we use

generate keys to generate a secret key a

secret signing key and a public

verification key that match up now we

give the secret key to the Challenger to

the judge and we give the public key to

both parties both to the Challenger and

to the attacker so the attacker only

knows information that’s public he only

knows the public key and his mission is

going to be to try to forge a message

the Challenger knows the secret key so

he can make signatures right now if you

think about a real-life application

and a real-life attacker would be able

to see valid signatures from their

would-be victim on a number of different

documents and maybe the attacker could

even manipulate the victim into signing

innocuous-looking documents if that’s

useful to the attacker

so in our game we’re going to allow the

attacker to get signatures on some

documents of his choice and we see that

in the diagram like this the attacker is

going to send over a message m0 to the

Challenger and the Challenger is going

to sign that message and send the

signature back the attacker can look at

that scratch its head a little bit and

send over another message m1 the

Challenger will sign that and we do that

for as long as the attacker wants the

attacker can send over any sequence of

messages he wants and get signatures on

them

once the attacker is satisfied that he’s

seen enough signatures and we’re gonna

let him see only a plausible number then

he’s going to pick some message m that

he wants to forge a signature on and

he’s going to try to forge a signature

and of course there’s a rule that says

that this M this message that he’s

trying to forge a signature on isn’t one

of the ones that messages that he’s

already seen because it would be really

easy for him to Ford to send over a

valid signature on m0 I mean we sent him

a valid signature on m0 earlier so he’s

going to pick some other message that he

hasn’t seen a signature for already and

he’s going to send over what he claims

is a signature on that message and then

the question is can he succeed so the

Challenger is going to run the verify

algorithm use the public verification

key and on that message and the

signature that the attacker provided and

is going to check whether it verifies

and if it does verify if this returns

true then the attacker wins the attacker

has forged a message and so this game is

what we use to define what it means for

a digital signature scheme to have the

unforgeable ‘ti property and if we want

to get really precise what we say is

that the attackers probability of

winning this game is negligible and that

that’s true no matter what algorithm the

attacker is using in other words we’re

going to say that the signature scheme

is unforgeable if no matter what

algorithm the attack

is using the attacker has only a

negligible chance of successfully

forging a message and if we have that

property together with the much easier

property that valid messages verify then

we have a digital signature scheme that

is suitable okay now there’s a bunch of

practical things that we need to do to

turn that algorithmic idea into a more

practically implementable signature

mechanism for example the algorithms we

talked about our randomized at least

some of them will be and so we need a

good source of randomness and this the

importance of this really can’t be

underestimated banned randomness will

sink you your algorithm will be insecure

and I’ll just point out here that

attacks on the source of randomness are

a favorite trick of intelligence

agencies and those are the people who

know what kinds of attacks are likely to

be successful in practice there’s a

limit on the message size that you’re

able to sign because real schemes are

going to operate on bit strings of

limited length the fix to that is simply

to use the hash of the message rather

than the message itself that way the

message can be really big

but the hash will be only 256 bits and

because hash functions are collision

free it’s safe to use the hash of the

message as the input to the digital

signature scheme rather than the message

and by the way a fun trick which will

see used later is that you can sign a

hash pointer and if you sign a hash

pointer then the signature covers or

protects the whole structure not just

the hash pointer itself but everything

it points to and everything it points to

for example if you were to sign the hash

pointer that was at the end of a

blockchain the result is that you would

effectively be digitally signing the

entire contents of that blockchain

that’s a useful trick that will see used

later okay now let’s get into the nuts

and bolts Bitcoin uses a particular

digital signature scheme that’s called

ECDSA that’s the elliptic curve digital

signature algorithm and it’s a US

government standard and we won’t go into

all the details of how ECDSA works it

relies on some extremely hairy math and

trust me you don’t want to see all the

details of how that works you can look

it up if you’re interested so we’ll skip

that one thing I’ll note though is with

ECDSA good randomness I said this before

but I’ll say it again

because it’s really essential good

randomness is especially essential with

ECDSA if you use bad randomness in

generating keys or even in signing you

probably leaked your private key it

stands to reason that if you use bad

randomness in generating a key that the

key that you generate is maybe not

secure but it’s a quirk of ECDSA that if

you use even if you use bad randomness

just in making a signature using your

perfectly good key that also will leak

your private key and then it’s game over

so we need to be especially careful

about this in practice this is a common

mistake so that completes the discussion

of digital signatures as a cryptographic

primitive and then the next segment

we’ll move on and talk about some

applications of digital signatures that

will turn out to be useful in building

crypto currencies

in segment one point four will move on

having covered digital signatures and

talk about a nice trick that we can use

that goes along with digital signatures

and the useful trick is this the idea is

to take a public key of one of those

public verification keys from a digital

signature scheme and equate that to an

identity that is an identity of a person

or an actor in a system so if you see a

signature that verifies correctly

that is if you see a signature such that

you can verify with someone’s public key

that that that is a signature on a

particular message then you can think of

that as that public key saying the

message you can literally think of a

public key as kind of like an actor or

an inn or a a party in a system and that

they can make statements by signing

those statements and so if you think in

that mindset then this public key is

like an identity it’s an actor who can

do stuff in the system and if you think

about it then for someone to speak for

pk that is for someone to be able to

make statements so that will be seen as

coming out of P K’s mouth you have to

know the matching secret key SK and so

if you know the secret key that

corresponds to the public key PK then

you can sign messages with that secret

key and what you’re doing essentially is

making statements on behalf of that

public key and that means that there is

an identity in the system which only you

can speak for and of course that’s what

you want an identity to be something

that one person can speak for or on

behalf of that everybody can can see all

right so if we’re going to treat public

keys as identities one of the

consequences of that is that you can

make a new identity whenever you want if

you want to make a new identity you just

do this you create a new random key pair

SK and PK by doing the generate Keys

operation in our digital signature

scheme and we get out SK and PK PK is

then the public name that you can use

that’s the name of that identity what

it’s called although in practice you’d

probably use the hash of PK because

public keys are big but again that will

leave that aside is it

so PK or the hash of it is the public

name that you use to talk about the

identity and SK the secret key is the

information that lets you the person who

generated this identity speak for the

identity you control the identity

because only you know the secret key and

if you generated this in such a way that

the public key PK looks random then

nobody needs to know who you are

you can generate a fresh identity that

looks random that looks that that looks

like a face in the crowd that only you

can control this brings us to the idea

of decentralized identity management

then rather than having a central place

that you have to go in order to register

as a user in a system you don’t need to

get a username you don’t need to inform

someone that you’re going to be using a

particular name if you want a new

identity just make one anybody can make

a new identity at any time and you can

make as many as you want if you prefer

to be known by five different names no

problem just make five identities if you

want to be somewhat anonymous for a

while you can make a new identity use it

just for a little while then throw it

away all of these things are possible

with decentralized identity management

and there’s no central point of control

so that you don’t have to have anyone

who’s in charge of it the system

operates in an entirely decentralized

way and this is the way Bitcoin in fact

does identity these identities are

called addresses in Bitcoin jargon and

so you hear the term address used in

talking about Bitcoin and crypto

currencies but what that really is is

just a public key or a hash of a public

key it’s an identity that someone made

up out of thin air as part of this

decentralized identity management scheme

now the obvious question that arises

when you’re talking about decentralized

identity management and people making up

these identities is how private is this

and the answer is it’s complicated on

the one hand the addresses that are made

up this way are not connected to your

real-world identity you can execute a

randomized algorithm it will make some

kind of PK that looks kind of random and

nothing exists initially to connect that

to who you are you can do that in the

privacy of your own home so that’s good

news if you want to be able to act

privately but

the bad news if you want to act

privately is that if that address if

that identity is making a series of

statements over time if it’s engaging in

a series of Acts over time that people

can see that whoever this is has done a

certain series of actions and they can

start to connect the dots gee this

person is acting a lot like Joe maybe

this person is Joe and so an observer

can link together these things over time

and make inferences and so this balance

between on the one hand there being no

initial tie to real-world identity and

on the other hand that a pattern of

behavior of an address emerging over

time and the question of what can be

inferred in which dots can be connected

that gets pretty complicated and that’s

really the question of privacy in a

crypto currency like Bitcoin and there’s

a whole lecture about that later on and

so I’m not going to steal the thunder of

that lecture I just want to give you an

idea of how decentralized identity makes

privacy a complicated question

in segment 1.5 we’re going to move from

talking about cryptography and we’re

going to move on to crypto currencies

now I know that many of you showed up

here for the crypto currency stuff and

and trust me there will be a lot more

crypto currency material in future

lectures

unfortunately we needed to eat some of

our cryptographic vegetables in order to

have the background to talk about crypto

currencies and now you’ll see when I

start talking about crypto currencies

how these pieces fit together and why

the cryptographic operations why cash

functions and digital signatures are

actually useful alright so in this

section I want to talk about some

simplified crypto currencies that give

us ideas about how systems like Bitcoin

work of course it’s going to require

about ten more lectures in order to

really spill out all of the implications

of how Bitcoin works and and what that

means but but let me talk about some

very simple crypto currencies to get the

discussion started and first let’s talk

about goofy coin goofy coin is about the

simplest cryptocurrency we can imagine

and it works kind of like this there are

just a couple rules of goofy coin the

first rule is that goofy can create new

coins goofy can make a new coin whenever

he wants and when he makes a new coin it

belongs to him so when goofy makes a

coin it’s represented by a data

structure like this here you have the

create coin operation and there’s a

unique coin ID that goofy generated and

then there’s a digital signature that

was put on it by goofy which anyone can

verify so anyone being given this can

verify that the signature is valid and

that it’s a signature of this statement

and new coins belong to goofy by

definition because those are the rules

that goofy made so that’s the first rule

goofy can create new coins the second

rule of goofy coin is that whoever owns

a coin can pass it on to someone else

they can spend it so for example here we

have the coin that I showed you before

that goofy created and now we’re going

to take a hash pointer to that coin and

and then we’re going to create a

statement Goofy’s going to make a

statement that says pay this to Alice

Alice is being named by a public key

here pay to public key Alice

the coin that’s represented by this hash

pointer and this in is also signed by

goofy now goofy is the one who owned

that coin and so goofy has to sign any

any transaction that spends the coin and

once this has happened now Alice owns

the coin Alice owns the coin and Alice

can prove that she owns the coin because

she can present this data structure here

which is validly signed by goofy and

points to a coin that was validly owned

by goofy and so this the correctness of

this coin is self-evident in the system

now Alice can move on and she can spend

the coin as well so here we have the

coin we had before this is down here at

the bottom we have the creation of the

coin signed by goofy now goofy paid the

coin to alice via this hash pointer and

he signed that now alice is the owner of

the coin now she can create a statement

like this that says pay this coin to

Bob’s public key and here’s a hash

pointer to the coin and now Alice signs

that son because Alice was the valid

owner of the coin which we could verify

by walking this chain now we know that

this is valid and the coin belongs to

Bob so Bob is now the owner of this coin

so those are all the rules of goofy coin

goofy can create new coins by simply

signing a statement that he’s making a

new coin with a unique coin ID and then

whoever owns a coin can pass it on to

someone else by signing a statement

saying pass on this coin to person X and

you can verify the validity of a coin by

simply following the chain and verifying

all of the signatures along the way

that’s goofy coin all right now there’s

a problem though there’s a big security

problem with goofy coin and we can see

it in this structure here so look at

this coin here this is the coin that

goofy made and then paid to Alice Alice

was the owner of that coin and there’s a

problem

Alice paid this coin on to Bob but now

Alice makes another data structure like

this which pays this pays to chuck the

very same coin and this is signed by

Alice now if truck doesn’t know about

this thing up on the upper left this

data structure let’s say Alice just gave

that to Bob and didn’t tell Chuck now

Chuck will look at this and he’ll think

that this is perfectly valid and now

he’s the owner of the coin Chuck

has a valid looking claimed to be the

owner of this coin and Bob has an

equally valid looking claim to be an

owner of this coin and that’s a problem

because coins are not supposed to work

that way this is called a double

spending attack it’s called double

spending because alice is spending the

same coin twice and double spending

attacks are one of the key problems that

a cryptocurrency has to solve goofy coin

does not solve the double spending

attack and therefore goofy coin is not

secure

so although goofy coin is simple and we

understand its rules

it won’t cut it as a cryptocurrency

because it allows double spending so in

order to build a cryptocurrency that is

going to be workable we need to have

some solution to the double spending

problem and indeed the double spending

problem is the main design challenge

that we face in designing a

cryptocurrency so we need to somehow

improve on goofy coin and we’ll do that

by designing another coin which I’ll

call Scrooge coin Scrooge coin is going

to be rather like goofy coin except it

will solve the double spending problem

in a particular way and this coin was

created by Scrooge okay so this is a

little bit more complicated in terms of

data structures but here’s one of the

key ideas that Scrooge is going to

publish a history of all the

transactions that have happened this

will be a blockchain that data structure

we talked about before and it will be

digitally signed by Scrooge so anyone

can and it looks like this of course

it’s a series of blocks data blocks each

block will have one transaction in it

this block has the transaction with

transaction ID number 73 and it has the

contents of this transaction and then

there’s a hash pointer to the previous

block in the history okay and then and

then Scrooge will take the hash pointer

which represents this entire structure

and he’ll digitally sign it and publish

it now anybody can verify that Scrooge

really did sign this hash pointer and

then they can follow this chain all the

way back and see what is the entire

history of all the transactions in the

history of Scrooge coin as endorsed by

Scrooge okay now I said here that we put

one transaction in each block we do that

for simplicity of explanation but in

practice as an optimization we’d really

put multiple transactions into the same

block as Bitcoin does so you can bear in

mind as I talked about

Scrooge coin that that’s the way we’d

really do it in practice so Scrooge

publishes this history what does the

history do well the thing the history

does for us is it allows us to detect

double spending because assume Alice

owns a coin and she’s going to pay that

coin on to Bob and she’s then later

going to try to pay that coin on to

Charlie Charlie’s going to notice that

something is wrong because Charlie will

be able to look into the history and see

that Alice already paid that coin to Bob

in fact everyone will be able to see

that Alice already paid that coin to Bob

so if she tries to pay that coin to

chuck then everyone can see that that’s

a double spend and they’ll be able to

reject it Scrooge will be ejected and

everyone else will reject it and know

that they really shouldn’t trust Alice

all right so in Scrooge coin there are

two kinds of transactions the first kind

is a create coins transaction and what

it does is create new coins that’s like

the operation goofy could do in goofy

coin that makes a new coin but here

we’re going to allow multiple coins to

be created in one transaction so here’s

what a create coins transaction looks

like it has transaction ID number 73

let’s say in this case it’s transaction

type is create coins and then down here

there’s a list of which coins are

created each coin is going to have a

serial number within this transaction 0

1 2 etc each coin has a value it’s worth

a certain number of Scrooge coins and

each coin has a recipient which is going

to be a public key who gets that coin as

its created so this transaction types

creates a bunch of new coins and assigns

them to people as initial owners now

we’re going to have a concept in Scrooge

coin of a coin id that refers to a

particular coin so this particular coin

here is coin id 73 / n 0 because it was

created in transaction 73 and it was

number 0 within that transaction

similarly we have 73 / N 1 73 / n 2 and

so on so every coin in Scrooged coin has

a coin id that we can use to refer to it

a create coins transaction is always

valid why is it valid well because

Scrooge said so and they call it Scrooge

coin for a reason

if scrooge puts this into the history

which he signs then it’s valid by

definition

we don’t need to worry about whether

Scrooge is entitled to create coins just

like we didn’t need to worry in goofy

coin about whether goofy is entitled to

create coins the rules of the system

which were created by Scrooge simply say

that if Scrooge wants to make coins then

that’s valid so anything he puts into

the history is valid the second kind of

transaction we’re going to talk about is

app a coins transaction and this is a

transaction that consumes some coins and

destroys them and creates new coins of

the same total value but which might

belong to different people so over here

on the Left we have an example of what a

pay coinsurance action looks like this

is transaction ID number 73 let’s say

it’s type is pay coinsurance earned and

destroyed by this pay coins transaction

so we’re going to add up the value of

all of those coins and then we’re going

to create a bunch of new coins down here

0 1 & 2 etc just like before in the

create coins transaction each one has a

value each one will belong to a certain

recipient and those new coins had better

add up to the same total value as the

coins that we consume and then at the

bottom we have a set of digital

signatures this transaction has to be

signed by everyone who’s paying in a

coin so if you’re the owner of one of

the coins that’s going to be consumed in

this transaction

then you need to digitally sign the

transaction to say that you’re really

okay with spending this coin the rules

of Scrooge coins say that app a coins

transaction is valid if four things are

true first if the consumed coins are

valid that is they really were created

in previous transactions second that the

consumed coins were not already consumed

in some previous transaction that is

that this is not a double spend third

that the total value of the coins that

come out of this transaction is equal to

the total value of the coins that went

in and finally that the transaction is

validly signed by the owners of all of

the consumed coins if all of those

things are true then this pay coins

transaction is valid Scrooge will accept

it he’ll write it into the history into

the blockchain and everyone will see

that this transaction has happened one

thing to note about this scheme is that

coins are immutable

coins are never changed they’re never

subdivided they’re never combined all

they are is created once in one

transaction and then later consumed in

some other transaction but you can get

the same effect as being able to

subdivide or pay on or combine coins by

using transactions for example if you

want to subdivide a coin you can just

create a new transaction that consumes

that one coin and then produces two new

coins of the same total value and if you

want you can give those two new coins

back to yourself that’s a way that you

can subdivide a coin that you own

similarly you can combine coins or you

can pay on a coin in effect by just

creating a chain of transactions each of

which pass that value on in the form of

a new coin to someone else

so although coins are immutable in this

system it has all of the flexibility of

a system that didn’t have immutable

coins okay now we come to the core

problem with Scrooge coin Scrooge coin

will work people can see which coins are

valid it prevents double spending

because everyone can look into the into

the blockchain and see that all of the

transactions are valid and that every

coin is consumed only once but the

problem is Scrooge Scrooge thinks this

is fine right Scrooge says don’t worry

I’m honest but the fact is if Scrooge

starts misbehaving then we’re going to

have a problem or if Scrooge just gets

bored of the whole Scrooge coin scheme

and stops doing the things that he’s

supposed to do then the system won’t

operate anymore and so the problem we

have here is centralization that

although Scrooge is happy with this

system we as users of it might not be so

the central technical challenge that we

need to solve in order to improve on

Scrooge coin is can we discourage áfiveá

system that is can we get rid of that

centralised Scrooge figure can we have a

cryptocurrency that operates like

Scrooge coin in many ways but doesn’t

have any central trusted authority in

order to do that we’re going to need to

figure out how to provide the services

that Scrooge provides but do it in a

decentralized way and a way in which no

particular party is particularly trusted

that means we’re going to need to figure

out how every

can agree upon a single published

blockchain that is the agreed-upon

history of which transactions have

happened we need to figure out how

people can agree which transactions are

valid and in which transactions have

actually occurred and we need to figure

out how we can assign IDs to things in a

decentralized way if we can solve all of

those problems then we can build a

currency that is very much like Bitcoin

which is like Scrooge coin but without a

centralized party but in order to do

that it’s going to take a few more

lectures and we hope you’ll stick around

and watch them thanks

you