But how does bitcoin actually work?


What does it mean to have a bitcoin?
Many people have now heard of bitcoin, that’s
it’s a fully digital currency, with no government
to issue it and no banks needed to manage
accounts and verify transactions.
That no one actually knows who invented it.
Yet many people don’t know the answer to
this question, at least not in full.
To get there, and to make sure the technical
details underlying this answer feel motivated,
we’re going to walk through step-by-step
how you might have invented your own version
of Bitcoin.
We’ll start with you keeping track of payments
with your friends using a communal ledger.
Then, as you trust your friends and the world
less and less, and if you’re clever enough
to bring in a few tools of cryptography to
help circumvent the need for trust, what you
end up with what’s called a “cryptocurrency”.
Bitcoin is just the first implemented example
of a cryptocurrency, and there are now thousands
more on exchanges with traditional currencies.
Walking the path of inventing your own can
help set the foundation for understanding
some of the more recent players in the game,
and recognizing where there’s room for different
design choices.
In fact, one of the reasons I chose this topic
is in response to the unprecedented leap in
attention, investment and…well.. hype directed
at these currencies in just the last year.
I won’t comment or speculate on the current
or future exchange rates, but I think we’d
all agree that anyone looking to buy a cryptocurrency
should really know what it is.
Not just in terms of analogies with vague
connections to gold-mining, I mean an actual
direct description of what computers are doing
when sending, receiving and creating cryptocurrencies.
One thing worth stressing, by the way, is
that even though you and I will dig into the
underlying details here, which takes some
meaningful time, you don’t actually need
to know those details to use a cryptocurrency,
just like you don’t need to know the details
of what happens under the hood when you swipe
a credit card.
Like any other digital payments, there are
plenty of user-friendly applications that
let you send and receive these currencies
very easily.
The difference is that the backbone underlying
this is not a bank verifying transactions,
but a clever system of decentralized trustless
verification based on some of the math born
in cryptography.
To start, set aside the thought of cryptocurrencies
for a few minutes.
We’re going to start the story with something
more down to earth: Ledgers, and digital signatures.
If you and your friends exchange money pretty
frequently, paying your share of the dinner
bill and such, it can be inconvenient to exchange
cash all the time.
So you might keep a communal ledger that records
payments you intend to make in the future.
Alice pays Bob $20, Bob pays Charlie $40,
things like that.
This ledger will be something public and accessible
to everyone, like a website where anyone can
go and just add new lines.
At the end of every month, you all look through
the list of transactions and tally everything
up.
If you’ve spent more than you received,
you put that money into the pot, and if you’ve
received more than you spent, you take that
much money out.
So the protocol for being part of this system
looks something like this: Anyone can add
lines to the ledger, and at the end of every
month everyone gets together to settles up
with real money.
One problem with a public ledger like this
is that when anyone can add a line, what’s
to prevent Bob from going in and writing “Alice
pays Bob $100” without Alice approving?
How are we supposed to trust that all these
transactions are what the sender meant for
them to be?
This is where the first bit of cryptography
comes in: Digital signatures.
Like a handwritten signature, the idea here
is that Alice should be able to add something
next to a transaction that proves that she
has seen it, and approved of it.
And it should be infeasible for anyone else
to forge her signature.
At first it might seem like digital signatures
shouldn’t even be possible, since whatever
data makes up the signature can just be read
and copied by any computer, so how do you
prevent forgeries?
The way this works is that everyone generates
what’s called a public key/private key pair,
each of which looks like some string of bits.
The private key is sometimes also called the
“secret” key, so that we can abbreviate
it to sk while abbreviating the public key
as pk.
As the names suggest, the secret key is something
you should keep to yourself.
In the real world, your handwritten signature
looks the same no matter what document you’re
signing.
A digital signatures is much stronger, because
it changes for different messages.
It looks like a string of 1’s and 0’s,
commonly something like 256 bits, and altering
the message even slightly completely changes
what your signature on that message should
look like.
Formally, producing a signature involves some
function that depends both on the message
itself, and on your private key.
The private key ensures that only you can
produce the signature, and the fact that it
depends on the message means no one can just
copy one of your signatures to forge it on
another message.
Hand-in-hand with this is a function to verify
that a signature is valid, and this is where
the public key comes into play.
All it does it output true or false to indicate
if this was a signature created by the private
key associated with the public key you use
for the verification.
I won’t go into the details how how exactly
these functions work, but the idea is that
it should be completely infeasible to find
a valid signature if you don’t know the
secret key.
Specifically there is no strategy better than
just guessing and checking if random signatures
are valid using the public key until you hit
one that works.
There are 2^{256} possible signatures with
256 bits, and you’d need to find the one
that work.
This is a stupidly large number.
To call it astronomically large would be giving
way to much credit to astronomy.
In fact, I made a supplemental video devoted
just to illustrating what a huge number this
is.
Let’s just say that when you verify a signature
against a given message and public key, you
can feel extremely confident that the only
way someone could have produced it is if they
knew the secret key associated with the public
key.
There’s one slight problem here: If Alice
signs a transaction like “Alice pays Bob
$100”, even though Bob can’t forge Alice’s
signature on new messages, he could just copy
that same line as many times as he wants,
since the message/signature combination is
valid.
To get around that, we make it so that when
you sign a transaction, the message has to
include some unique id associated with that
transaction.
That way, if Alice pays Bob $100 multiple
times, each transaction requires a completely
new signature.
Alright, great, digital signatures remove
a huge aspect of trust in our initial protocol.
But even still, this relies on an honors system
of sorts.
Namely, you’re trusting that everyone will
actually follow through and settle up in cash
at the end of each month.
But what if, for example, Charlie racked up
thousands of dollars in debt, and just refuses
to show up?
The only real reason to revert to cash to
settle up is if some people, I’m looking
at you Charlie, owe a lot of money.
So maybe you have the clever idea that you
never actually have to settle up in cash as
long as you have some way to prevent people
from spending too much more than they take
in.
What you might do is start by having everyone
pay $100 into the pot, and have the first
few lines of the ledger will read “Alice
gets $100, Bob gets $100, etc.
Now, just don’t accept transactions when
someone is spending more than they have on
the ledger.
For example, after starting everyone off with
$100, if the first two transaction are “Charlie
pays Alice $50” and “Charlie pay Bob $50”,
if he were to try to add “Charlie pays You
$20”, that would be invalid, as invalid
as if he never signed it.
Notice, this means you need to know the full
history of transactions to verify that a new
one is valid.
And this is, more or less, going to be true
for cryptocurrencies as well, though there
is a little room for optimization.
What’s interesting here is that this step
somewhat removes the connection between the
Ledger and physical cash.
In theory, if everyone in the world used this
Ledger, you could live your whole life just
sending and receiving money on this ledger
without ever converting to real US.
To emphasize this point, let’s start referring
to quantities on the ledger as “LedgerDollars”,
or LD for short.
You’re of course free to exchange LedgerDollars
for real US dollars, for example maybe Alice
gives Bob a $10 bill in the real world in
exchange for him adding and signing the transaction
“Bob pays Alice 10 LedgerDollars” to the
communal ledger.
But exchanges like this are not guaranteed
in the protocol.
It’s now more analogous to how you might
exchange Dollars for Euros or any other currency
on the open market, it’s just its own independent
thing.
This is the first important thing to understand
about Bitcoin, or any other cryptocurrency:
What it is is a ledger; the history of transactions
is the currency.
Of course, with Bitcoin money doesn’t enter
the Ledger with people buying into using cash,
I’ll get to how new money enters the ledger
in just a few minutes.
Before that, there’s an even more significant
difference between our current system of LedgerDollars
how cryptocurrencies works.
So far, I’ve said that this ledger is some
public place, like a website where anyone
can add new lines.
But this requires trusting a central location.
Namely, who hosts that website?
Who controls the rules of adding new lines?
To remove that bit of trust, we’ll have
everyone keep their own copy of the ledger.
Then to make a transaction, like “Alice
pays Bob 100 LedgerDollars”, you broadcast
into the world for people to hear and record
on their own private Ledgers.
But unless we do something more, this system
would absurdly bad.
How can you get everyone to agree on what
the right ledger is?
When Bob receives the transaction “Alice
pays Bob 10 LedgerDollars”, how can he be
sure that everyone else received and believes
that same transaction?
That he’ll be able to later use those 10
LedgerDollars to make a trade with Charlie.
Really, imagine yourself just listening to
transactions being broadcast.
How can you be sure that everyone else is
recording the same transactions in the same
order?
Now we’ve hit on an interesting puzzle:
Can you come up with a protocol for how to
accept or reject transactions and in what
order so that you can feel confident that
anyone else in the world following the same
protocol has a personal ledger that looks
the same as yours?
This is the problem addressed in the original
Bitcoin paper.
At a high level, the solution Bitcoin offers
to trust whichever ledger has the most computational
work put into it.
I’ll take a moment to explain what exactly
that means, which involves this thing called
a “Cryptographic hash function”.
The general idea we’ll build to is that
if you use computational work as a basis for
what to trust, you can make it so that fraudulent
transactions and conflicting ledgers would
require an infeasible amount of computation.
Again, this is getting well into the weeds
beyond what anyone would need to know just
to use a currency like this.
But it’s a really cool idea, and if you
understand it, you understand the heart of
bitcoin and other cryptocurrencies.
A hash function takes in any kind of message
or file, and outputs a string of bits with
a fixed length, like 256 bits.
This output is called the “hash” or “digest”
of the message, and it’s meant to look random.
It’s not random; it always gives the same
output for a given input.
But the idea is that when you slightly change
the input, maybe editing just one character,
the resulting hash changes completely.
In fact, for the hash function I’m showing
here, called SHA256, the way that output changes
as you slightly change the input is entirely
unpredictable.
You see, this is not just any hash function,
it’s a cryptographic hash function.
That means it’s infeasible to compute in
the reverse direction.
If I show you some specific string of 1’s
and 0’s and ask you to find an input message
so that the SHA256 hash of that message gives
this exact string of bits, you will have no
better method than to guess and check.
Again, if you want a feel for just how much
computation would be needed to go through
2256 guesses, take a look at the supplement
video.
I actually had way too much fun writing that
thing.
You might think you could reverse engineer
the desired input by really digging through
the details of how the function works, but
no one has ever found a way to do that.
Interestingly, there’s no proof that it’s
hard to compute in the reverse direction,
yet a huge amount of modern security depends
on cryptographic hash functions.
If you were to take a look at what algorithms
underlie the secure connection that your browser
is making with YouTube right now, or that
it makes with a bank, you will likely see
a name like SHA256 in there.
For right now, our focus will just be on how
such a function can prove that a particular
list of transactions is associated with a
large amount of computational effort.
Imagine someone shows you a list of transactions,
and they say “I found a special number so
that when you put this number at the end of
list of transactions, and apply SHA256 the
entire thing, the first 30 bits of the output
are zeros”.
How hard do you think it was for them to find
that number?
For a random message, the probability that
the hash happens to start with 30 successive
zeros is 1 in 230, which is about 1 in a billion.
Because SHA256 is a cryptographic hash function,
the only way to find a special number like
this just guessing and checking.
So this person almost certainly had to go
through about a billion different numbers
before finding this special one.
And once you know the number, you can quickly
verify that this hash really does start with
30 zeros.
In other words, you can verify they they went
through a large amount of work without having
to go through that same effort yourself.
This is called a “proof of work”.
And importantly, all this work is intrinsically
tied to that list of transactions.
If you change one of the transactions, even
slightly, it would completely change the hash,
so you’d have to go through another billion
guesses to find a new proof of work, a new
number that makes it so that the hash of the
altered list together with this new number
starts with 30 zeros.
So now think back to our distributed ledger
situation.
Everyone is broadcasting transactions, and
we want a way for everyone to agree on what
the correct ledger really is.
As I said, the core idea behind the original
bitcoin paper is to have everybody trust whichever
ledger has the most work put into it.
The this works is to first organize a given
ledger into blocks, where each block consists
of a list of transactions, together with a
proof of work.
That is, a special number so that the hash
of the whole block starts with a bunch of
zeros.
For the moment let’s say it has to start
with 60 zeros, but later I’ll return back
to how you might choose that number.
In the same way that a transaction is only
considered valid if it is signed by the sender,
a block is only considered valid if it has
a proof of work.
Also, to make sure there is a standard way
to order of these blocks, we’ll make it
so that a block has to contain the hash of
the previous block.
That way, if you change any block, or try
to swap the order of two blocks, it would
change the block after it, which changes that
block’s hash, which changes the next block,
and so on.
That would require redoing all the work, finding
a new special number for each of these blocks
that makes their hashes start with 60 zeros.
Because blocks are chained together like this,
instead of calling it a ledger, this is commonly
called a “Blockchain”.
As part of our updated protocol, we’ll now
allow anyone in the world to be a “block
creator”.
What this means is that they’ll listen for
the transactions being broadcast, collect
them into a block, then do a whole bunch of
work to find the special number that makes
the hash of this block start with 60 zeros,
and broadcast out the block they found.
To reward a block creator for all this work,
when she puts together a block, we’ll allow
her to include a special transaction at the
top in which she gets, say, 10 LedgerDollars
out of thin air.
This is called the block reward.
It’s a special exception to our usual rules
about whether or not to accept transactions;
it doesn’t come from anyone, so it doesn’t
have to be signed.
It also means that the total number of LedgerDollars
in our economy increases with each new block.
Creating blocks is often called “mining”,
since it requires a lot of work, and it introduces
new bits of currency into the economy.
But when you hear or read about miners, keep
in mind that what they’re really doing is
creating blocks, broadcasting those blocks,
and getting rewarded with new money for doing
so.
From the miners perspective, each block is
like a miniature lottery, where everyone is
guessing numbers as fast as they can until
one lucky individual finds one that makes
the hash of the block start with many zeros,
and gets rewarded for doing so.
The way our protocol will now work for someone
using this system is that instead of listening
for transactions, you listen for new blocks
being broadcast by miners, updating your own
personal copy of the blockchain.
The key addition is that if you hear of two
distinct blockchains with conflicting transaction
histories, you defer to the longest one, the
one with the most work put into it.
If there’s a tie, wait until you hear of
an additional block that makes one longer.
So even though there is no central authority,
and everyone is maintaining their own copy
of the blockchain, if everyone agrees to give
preference to whichever blockchain has the
most work put into it, we have a way to arrive
at decentralized consensus.
To see why this makes for a trustworthy system,
and to understand at what point you should
trust that a payment is legitimate, it’s
helpful to walk through what it would take
to fool someone in this system.
If Alice wants to fool Bob with a fraudulent
block, she might try to send him one that
includes a her paying him 100 LedgerDollars,
but without broadcasting that block to the
rest of the network.
That way everyone else thinks she still has
those 100 LedgerDollars.
To do this, she’d have to find a valid proof
of work before all other miners, each working
on their own block.
And that could happen!
Maybe Alice wins this miniature lottery before
anyone else.
But Bob will still be hearing broadcasts made
by other miners, so to keep him believing
the fraudulent block Alice would have to do
all the work herself to keep adding blocks
to this special fork in Bob’s blockchain
that’s different from what he’s hearing
from the rest of the miners.
Remember, as per the protocol Bob always trusts
the longest chain he knows about.
Alice might be able to keep this up for a
few blocks if just by chance she happens to
find blocks more quickly than all of the rest
of the miners on the network combined.
But unless Alice has close to 50% of the computing
resources among all miners, the probability
becomes overwhelming that the blockchain that
all the other miners are working on grows
faster than the single fraudulent blockchain
that Alice is feeding Bob.
So in time Bob will reject what he’s hearing
from Alice in favor of the longer chain that
everyone else is working on.
Notice that means you shouldn’t necessarily
trust a new block that you hear immediately.
Instead, you should wait for several new blocks
to be added on top of it.
If you still haven’t heard of any longer
blockchains, you can trust that this block
is part of the same chain everyone else is
using.
And with that, we’ve hit all the main ideas.
This distributed ledger system based on a
proof of work is more or less how the Bitcoin
protocol works, and how many other cryptocurrencies
work.
There’s just a few details to clear up.
Earlier I said that the proof of work might
be to find a special number so that the hash
of the block starts with 60 zeros.
The way the actual bitcoin protocol works
is to periodically change that number of zeros
so that it should take, on average, 10 minutes
to find a block.
So as there are more and more miners on the
network, the challenge gets harder and harder
in such a way that this miniature lottery
only has about one winner every 10 minutes.
Many newer cryptocurrencies have much shorter
block times.
All of the money in Bitcoin ultimately comes
from some block reward.
These rewards 50 Bitcoin per block.
There’s a great site called “block explorer”
where you can look through the bitcoin blockchain,
and if you look at the very first few blocks
on the chain, they contain no transactions
other than the 50 Bitcoin reward to the miner.
Every 210,000 blocks, which is about every
4 years, that reward gets cut in half.
So right now, the reward is at 12.5 Bitcoin
per block, and because this reward decreases
geometrically over time, there will never
be more than 21 million bitcoin in existence.
However, this doesn’t mean miners will stop
earning money.
In addition to the block reward, miners can
also pick up transactions fees.
The way this works is that whenever you make
a payment, you can optionally include a small
transaction fee with it that will go to the
miner of whatever block includes that payment.
The reason you might do this is to incentivize
miners to actually include the transaction
you broadcast into the next block.
You see, in Bitcoin, each block is limited
to about 2,400 transactions, which many critics
argue is unnecessarily restrictive.
For comparison, Visa processes an average
of around 1,700 transactions per second, and
they’re capable of handling more than 24,000
per second.Slower processing on Bitcoin means
higher transactions fees, since that’s what
determines which transactions miners choose
to include in new blocks.
This is far from a comprehensive coverage
of cryptocurrencies.
There are many nuances and alternate design
choices I haven’t touched here, but hopefully
this can provide a stable Wait-but-Why-style
tree trunk of understanding for anyone looking
to add a few more branches with further reading.
Like I said at the start, one of the motivations
behind this video is that a lot of money has
started flowing towards cryptocurrencies,
and even though I don’t want to make any
claims about whether that’s a good or bad
investment, I do think it’d be healthy for
people getting into this game to at least
know the fundamentals of the technology.
As always, my sincerest thanks those of you
making this channel possible on Patreon.
I understand not everyone is in a position
to contribute, but if you’re still interested
in helping out, one the best ways to do that
is simply to share videos that you think might
be interesting or helpful to others.
I know you know that, but it really does help.
I also want to thank Protocol Labs for their
support of this video.
This is an organization that runs a number
of research and development projects, and
if you follow the links I’ve left in the
description to read into the details of these
projects, you’ll notice some strong parallels
with the concepts covered in this video.
The challenges and benefits of decentralization
are by no means limited to currency and transaction
histories, and the usefulness of tools from
cryptography like hash functions and digital
signatures are likewise much more general.
For example, a couple of Protocol Lab’s
projects, such as IPFS and Filecoin, center
on distributed filestorage, which opens a
whole field of interesting challenges and
possibilities.
For any developers out there, Protocol labs
places a high value on open source, so if
you’re interested you can join what’s
already a very strong community of contributors.
They’re also looking to hire more full-time
developers, so if you think you might be a
good fit for some of these projects, definitely
apply.

100 Comments

Add a Comment

Your email address will not be published. Required fields are marked *