Bitcoin and Anonymity – Crypto Academy Lecture 6


hello and welcome to the sixth lecture
by now you’ve seen a lot of the basics
of Bitcoin how the system works how
mining works and how to use Bitcoin as a
currency now let’s get to what has been
one of the most controversial aspects of
Bitcoin which is the anonymity
properties of Bitcoin and in fact
there’s a lot about Bitcoin and
anonymity that you’ll hear different
opinions on is Bitcoin anonymous first
of all are anonymous cryptocurrencies
even a good thing is it good for people
who have a stake in Bitcoin is it good
for society and what are the various
proposals that have been made to
improve Bitcoin’s anonymity how well do
these work which of those should we adopt
and so on so in this lecture
what we’re going to do is help cut
through all of that confusion and we’re
gonna discuss where things are what are
the options and where things seem to be
going so let’s start like this let’s
start with a basic understanding of what
we even mean when we say anonymity in
Bitcoin and some of the overall concepts
like how does anonymity tie in to
privacy is that a good thing or a bad
thing can we only have the good aspects
of anonymity without the bad a variety
of questions like that and then we’ll
see a variety of proposals some already
existing and some that may be
implemented someday for improving
Bitcoin’s anonymity or creating different
anonymous cryptocurrencies altogether
and what’s interesting about them is
that they offer a variety of increasing
levels of cryptographic sophistication
as we go down this list and we’ll learn
to see what the trade-offs are and
analyze the anonymity properties how
deployable these are and so on alright
let’s get started if you look online
you’ll see there are a number of people
and groups saying that bitcoin is
anonymous there’s no shortage of
opinions on this let me just pull out
one quote in particular this is the
WikiLeaks donation page it says in plain
and simple terms bitcoin is a secure and
anonymous currency is that actually true
well you’ll also find a variety of
opinions to the contrary again I’m just
pulling out one example this is
Wired UK saying Bitcoin won’t hide you
from the NSA’s prying eyes so how can
we resolve this confusion
let’s look at what the word
anonymous means at quite a literal
level anonymous means without a name and
so what does that mean exactly well
there’s two ways to interpret it we note
that in Bitcoin addresses are public
key hashes rather than real identities
so you don’t need to put in your real
name in order to interact with the
system but we can interpret
this property of being without a name in
two different ways
we can interpret it as interacting
without your real name or we can
interpret it as interacting without any
name at all now if you interpret it as
interacting without your real name then
certainly bitcoin is anonymous in that
sense but we do have these public key
hashes that act as some sort of pseudo
identities and so when computer
scientists look at the situation they
don’t use the term anonymous to describe
this they call it pseudonymity and
there’s a very clear difference between
the two and it’s an important one and
we’ll see why in a second you might
wonder yeah even though you’re using a
pseudonym which is your public key hash
you can create any number of them you
can have as many pseudonyms as you
want does that make it anonymous well
the answer is not quite and we’ll get
into that as well okay so if computer
scientists call it pseudonymity what is
anonymity then is there a clear
definition of what it would take for
something to be called anonymous at a
conceptual level the answer is very
simple anonymity in computer science is
just pseudonymity together with
unlinkability so what is this property called
unlinkability at an intuitive level
we’ll get into better definitions in a
little bit but at an intuitive level
what unlinkability means is that as a
user interacts with a system repeatedly
these different interactions should not
be able to be tied to each other
from the point of view of some adversary
so you have to be talking about a
specific adversary for this to even make
sense
now this distinction here between full
anonymity and mere pseudonymity is
something that you might be familiar
with from a variety of other contexts
and one good way that I like to explain
this is to look at online forums
and again here the distinction between a
mere pseudonymous interaction and
anonymous interaction comes up in
different forums and reddit is a good
example of a forum where you pick a
long-term pseudonym and interact over a
period of time with that pseudonym you
could create different pseudonyms but
it’s going to be practically infeasible
to create a new pseudonym every single
time you want to post a comment and it’s
not even very meaningful so reddit
offers pseudonymous interaction the
opposite of that fully anonymous
interaction where you can make posts
with no attribution at all is the model
that you typically have in 4chan and
there’s a similar difference in Bitcoin
as well and Bitcoin is in the
pseudonymous model more than the
anonymous model okay but let’s talk
about why this difference is important
in Bitcoin why is mere pseudonymity
not sufficient if you want privacy after
all if you have pseudonymity it seems
like even if somebody can create a
pseudonymous profile of all of your
interactions on the system they can’t
tie it back to your real identity well
here’s the answer to that it turns out
that if you have this pseudonymous
profile it’s pretty fragile it’s very
easy for it to get linked back to your
real identity at some point and if that
happens at any point then of course all
of your transactions past present and
future have been linked to your identity
so here are a couple of different ways
in which that can happen one is that a
variety of Bitcoin businesses online
wallet services exchanges and others
even vendors in a lot of cases are going
to want your real-life identity
before letting you transact with them
consider this analogy you go to a coffee
shop you pay for your coffee with
bitcoins and of course if you’re there
in the store then the person who’s giving
you your coffee sort of knows who
you are even if they don’t actually ask
for your real name and so your physical
identity does get tied to one of your
Bitcoin transactions and if that Bitcoin
transaction then gets tied to all of
your Bitcoin transactions then that is a
complete violation of anonymity so this
notion of a pseudonymous profile is very
fragile it could easily get compromised
in a variety of ways and also
even if such a direct linkage doesn’t
happen these linked profiles can be
deanonymized due to side channels what do
I mean by side channels well here’s
something that I find intriguing that
might seem like a tall claim but in fact
such things have been known to happen
maybe somebody looks at a profile of
your pseudonymous Bitcoin transactions
and finds that you interact at certain
times of day and they’re able to
correlate the times of day when you’re
active online with the times of day when
your Twitter account is posting tweets
and so they’re able to find a connection
between your Twitter identity and your
transactions on Bitcoin similar attacks
have been known to happen
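As a rough illustration of the timing side channel just described, the sketch below compares the hours of day at which a pseudonymous profile transacts with the hours at which some candidate accounts post. Everything here is hypothetical: the data, the account names, and the choice of cosine similarity as the comparison are assumptions made for the example, not a description of any real attack.

```python
from math import sqrt

def hourly_histogram(hours):
    """Count how many events fall in each hour of the day (0-23)."""
    counts = [0] * 24
    for h in hours:
        counts[h % 24] += 1
    return counts

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical hours of day at which one pseudonymous profile transacted.
tx_hours = [9, 9, 10, 13, 21, 22, 22, 23]

# Hypothetical posting hours for two candidate social media accounts.
candidates = {
    "account_a": [9, 10, 10, 13, 21, 22, 23, 23],
    "account_b": [2, 3, 3, 4, 5, 15, 16, 17],
}

tx_hist = hourly_histogram(tx_hours)
scores = {
    name: cosine_similarity(tx_hist, hourly_histogram(hours))
    for name, hours in candidates.items()
}
best_match = max(scores, key=scores.get)
print(best_match, scores)
```

A real adversary would need far more data and a more careful statistical test, but even this toy version shows why activity patterns alone can narrow down who is behind a pseudonym.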
so this is why this notion of a
pseudonymous profile is considered quite
fragile and for real anonymity we want
the stronger notion of unlinkability so
let’s try to define in a little bit
more concrete sense what unlinkability
means in the context of Bitcoin and we
can do that in a variety of different
ways one is that it should be hard to
link together different addresses of the
same user another is that it should be
hard to link together different
transactions made by the same user both
of these seem intuitive look at this one
though it should be hard to link the
sender of a payment to its recipient
this one might sound a little confusing
at first because if you interpret a
payment as a Bitcoin transaction then of
course that transaction has inputs and
outputs and these inputs and outputs are
inevitably going to be in the blockchain
publicly and linked together and so you
might think that this is impossible to
achieve but if we interpret this notion
of payment in a different way not as a
single direct Bitcoin transaction but
perhaps as an indirect sort of
payment that goes through a circuitous
route of transactions then one might
imagine that the ultimate sender and the
ultimate recipient of that payment
might not immediately be linkable by
looking at the Bitcoin blockchain so
these are all somewhat more concrete but
still at an intuitive level varieties of
unlinkability that one might want to
shoot for but if you look at this last
definition it might still be not
entirely convincing let’s say that you
pay for a particular product and it
costs a certain amount of Bitcoin
then maybe you send that payment through
a circuitous route of transactions but
still you might think somebody looking
at the blockchain must be able to infer
something specifically that a certain
number of bitcoins left some address
and about the same number showed up at some
other address and these two amounts might be
slightly different because of
transaction fees and so on but roughly
equal and also roughly in
the same time period because there can’t
be too much of a lag between the sending
and the receiving of a payment
and so clearly even if we try to achieve
this kind of unlinkability it can’t be
unlinkability among all possible
transactions but only within some smaller subset of
transactions that look like each other
so let’s make this a little bit more
concrete now and this is how we quantify
anonymity we usually don’t try to
achieve complete unlinkability which is
unlinkability among all possible
transactions or addresses in the system
but instead we go for something more
measured we try to maximize the size of
our anonymity set the anonymity set is
the crowd of other addresses
or transactions that we’re trying to
hide in so if I can be reasonably sure
that with respect to some adversary
there are these thousand other
transactions that look just like mine
and the adversary can’t tell which one
was mine then that we might consider to
be a pretty good level of anonymity and
calculating this anonymity set is not
trivial at all it takes a few steps you
have to first define concretely what
your adversary model is and you have to
reason carefully about what that
adversary knows what they don’t know and
what they cannot know and there’s no
general formula for doing this it
requires carefully analyzing each
protocol and system and doing it on a
case-by-case basis I want to point out
that in the Bitcoin community often
people carry out intuitive analyses
of anonymity services for example
mixing services that we’re gonna see
later in this lecture and often they
come up with measures like taint analysis
this is an intuitive way of tracking the
flow between a particular sending
address and a particular receiving
address and intuitively it might make a
lot of sense but if we consider it from
the point of view of how we actually
should calculate anonymity
taint analysis is not a very good
measure of how much anonymity you get
from a system and the reason for that
is that it assumes a particular type
of attack the adversary might
carry out a rather naive attack looking
directly for quantities of flow between
a sending and a receiving address and if
your adversary were a little bit
cleverer than that then you might carry
out taint analysis and think that you
have a lot of anonymity in a certain
situation but in fact you might not so
the bottom line from this slide is that
quantifying anonymity must be done in
terms of the anonymity set and in some
cases probability distributions on top
of that anonymity set and it requires a
careful analysis of the protocol and the
system you can’t apply a simple formula
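To make the idea of an anonymity set, and of a probability distribution on top of it, a little more tangible, here is a minimal sketch. The adversary model is an assumption invented for the example: the adversary is taken to see only amounts and timestamps, and to treat any transaction with a similar amount in a nearby time window as indistinguishable from the target. The transaction data, the tolerance values, and the use of entropy to get an "effective" set size are likewise illustrative choices, not a general formula.

```python
from math import log2

def anonymity_set(transactions, observed, amount_tolerance, time_window):
    """Transactions the assumed adversary cannot tell apart from the observed payment."""
    return [
        tx for tx in transactions
        if abs(tx["amount"] - observed["amount"]) <= amount_tolerance
        and abs(tx["time"] - observed["time"]) <= time_window
    ]

def effective_set_size(probabilities):
    """2**H, where H is the Shannon entropy of the adversary's belief over the set."""
    entropy = -sum(p * log2(p) for p in probabilities if p > 0)
    return 2 ** entropy

# Hypothetical transactions visible on the ledger.
transactions = [
    {"id": "t1", "amount": 1.000, "time": 100},
    {"id": "t2", "amount": 0.999, "time": 103},
    {"id": "t3", "amount": 1.000, "time": 500},
    {"id": "t4", "amount": 5.000, "time": 101},
    {"id": "t5", "amount": 1.001, "time": 98},
]
observed = {"amount": 1.0, "time": 100}

candidates = anonymity_set(transactions, observed, amount_tolerance=0.01, time_window=10)
candidate_ids = [tx["id"] for tx in candidates]

# If the adversary considers every candidate equally likely, the effective size
# equals the set size; a skewed belief shrinks it.
uniform_size = effective_set_size([1 / len(candidates)] * len(candidates))
skewed_size = effective_set_size([0.8, 0.1, 0.1])
print(candidate_ids, uniform_size, skewed_size)
```

Changing the assumed adversary (for example letting it also see network-level information) changes the set, which is why the analysis has to be redone for each protocol and each adversary.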
okay let’s switch gears a little bit and
talk about the ethics of anonymity why
do people want anonymity we’ve already
seen a little bit the connection between
anonymity and privacy but let’s make
that very concrete now in blockchain
based currencies because all
transactions are recorded on the ledger
they’re totally and publicly and
permanently traceable and so if your
identity ever gets linked to these
transactions you’re in a situation where
your privacy level is much worse than
you get with traditional banking why
because anybody might be able to carry
out this type of deanonymization attack
not just a company or a
government that you might be worried
about but any member of the public and
since your transactions are permanent
a loss of anonymity years down the
line could affect all of your
transactions today and vice-versa so we
really want anonymity to even get the
privacy level of cryptocurrencies to the
level that we enjoy with the
traditional system but also people hope
that it can give us a new level of
privacy of course we have to acknowledge
the concerns as well and one of the
major concerns is money laundering and
all of the bad things that that can
enable so let’s talk about that this is
definitely a legitimate worry I wouldn’t
be in favor of studying anonymity and
cryptocurrencies and ignoring the
ethical aspects and saying oh that’s not
something I’m going to worry about I’m
only interested in the technology I
think it’s important to consider the
ethical aspects there’s one item of
comfort
that I will offer though if you look at
how things stand currently in Bitcoin
the difficulty of things like money
laundering is not necessarily because
the blockchain is not so anonymous and
so it’s easy to trace flows but instead
the difficulty stems much more from the
fact that moving large flows into and
out of the currency rather than within
Bitcoin is what is really hard in other
words cashing out is hard and so
anti-money laundering efforts have great
promise if they’re focused in this part
of the system and the good news is that
all of these attempts to improve
anonymity in Bitcoin don’t affect this
part of the equation in any way and
so I would recommend that Bitcoin
researchers and developers coordinate
efforts with anti-money laundering
efforts by law enforcement and others so
that the technical aspects of Bitcoin
anonymity can be relatively separate
from law enforcement and legal aspects
and so on nevertheless one could try to
ask can’t we design the technology in
such a way that only the good uses of
Bitcoin anonymity are allowed and the
bad uses are somehow prohibited
well this turns out to be a quite common
conundrum in computer security and
privacy in a lot of scenarios we want
something like this but it never turns
out to be possible why because these
different uses that we’re talking about
that we perceive as being very different
morally
are going to be almost identical
technologically and if we want to encode
some sort of moral rules into the
technical rules of the system that are
going to be automatically enforced by
miners it’s not even clear how to do
that and so hence my recommendation of
separating out the technical anonymity
properties of the system from the legal
principles that we put on top of it
in terms of how people use that currency
it’s not a completely satisfactory
solution but it’s perhaps the best way
we have of trading off the good with the
bad I do want to point out though that this
is far from the first time that we’re
considering this dilemma it’s come up in
the context of Tor an anonymous
communication network and anonymous
communication enables bad actions just
as much as anonymous moving of funds
does and so Tor has really had to grapple
with this problem in a very simple
picture Tor is a communication
network that routes messages between a
sender and a receiver through a network
of nodes but further through some clever
encryption ensures that as long as at
least some of the nodes in that network
are honest then the adversary is not
going to be able to link the sender to
the receiver so that’s what Tor does
and you can see how it can enable a lot
of bad activities let’s look at some
activities good and bad that do happen
on the Tor network it’s used first of
all by normal people who want to
protect themselves from being tracked
online by marketers or who want various other
privacy properties online when they’re
browsing websites it’s used by
journalists and activists and dissidents
and so on and so that’s clearly an
important use case it’s also used by law
enforcement because if they want to do
an electronic sting operation then they
want to be able to visit websites
without revealing that their IP address
is coming from a law enforcement block
so clearly a lot of activities that we
might approve of but it’s also used by
botnets for example for spreading
malware between nodes in the network and
unfortunately there is also child
pornography on the network so
distinguishing between these uses at a
technical level is essentially
impossible
and so Tor has grappled with this issue
and as a society we have grappled with
it and by and large we’ve concluded that
it’s better for the world that the
technology exists than that it doesn’t and in
fact one of the main funders of Tor is
the US State Department they’re
interested in it because Tor helps
dissidents in other countries who might
be fighting oppressive governments and
so on and in fact recently there was a
news story about the FBI having a
successful string of sting operations
against people using Tor for child
pornography and so of course we have to
remember there is a level above the
technology that law enforcement can
exploit a variety of ways to get to
people who are using these systems for
bad purposes and so it preserves a sense
of balance so let’s switch gears a
little bit once more
let’s look at the history of anonymous
e-cash even though with Bitcoin these
questions are quite controversial and
there are debates about how anonymous
exactly Bitcoin is and what are the
options and so on this is not the
first time that we have thought about
anonymous cryptocurrencies at a
technical level these efforts have quite
a long history in fact all the way back
in 1982 more than three decades ago
cryptographer David Chaum proposed
something called blind signatures that
helped him develop anonymous electronic
cash so what are blind signatures blind
signatures are a two party protocol two
parties communicate with each other and
at the end of that one party has
produced a digital signature of some
input without actually knowing what that
input is I know it sounds a little bit
like magic but I encourage you to look
it up it’s not that sophisticated at a
technical level it’s quite simple
to understand if you work through the
details but since I’m not actually going
to go into the details now let’s for the
moment assume that this works by
magic so assuming that we have blind
signatures how can that help us achieve
an electronic cash protocol that’s what
David Chaum did and as we go through
this protocol try to see if you can spot
any flaws with it other than the
anonymity properties or lack thereof
it’s quite a simple protocol I’m going
to show it to you in just one slide
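For readers who would rather not take blind signatures as magic, here is a minimal sketch of one well-known construction, blinding with textbook RSA. The lecture itself does not go into the details, so the choice of RSA here is an assumption, and the tiny key (n = 3233, e = 17, d = 2753) and all function names are made up for illustration. A real scheme would use a large key and would sign a hash of the message; this is only meant to show the blinding arithmetic.

```python
import secrets
from math import gcd

# Toy RSA key for illustration only: n = 61 * 53. Far too small to be secure.
N, E, D = 3233, 17, 2753

def blind(message, n=N, e=E):
    """User side: hide the message behind a random blinding factor r."""
    while True:
        r = secrets.randbelow(n - 2) + 2  # r in [2, n - 1]
        if gcd(r, n) == 1:                # r must be invertible modulo n
            break
    return (message * pow(r, e, n)) % n, r

def sign_blinded(blinded, n=N, d=D):
    """Signer side: sign whatever number it is given, without learning the message."""
    return pow(blinded, d, n)

def unblind(blinded_signature, r, n=N):
    """User side: strip the blinding factor to obtain a signature on the message."""
    return (blinded_signature * pow(r, -1, n)) % n  # modular inverse, Python 3.8+

def verify(message, signature, n=N, e=E):
    """Anyone: check the signature against the public key."""
    return pow(signature, e, n) == message % n

serial = 1234                              # the value the user wants signed
blinded, r = blind(serial)                 # what the signer actually sees
signature = unblind(sign_blinded(blinded), r)
print(verify(serial, signature))
```

The signer only ever sees `blinded`, yet the unblinded result is the same value it would have produced by signing `serial` directly, which is the property the e-cash protocol relies on.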
now imagine that there is a bank and
this is a protocol for anonymous e-cash
through blind signatures imagine that
there is a bank and the bank stores
various things in its database in
particular it stores these two tables
the first table has a mapping of users
with the balance that they have in their
bank account these balances don’t refer
to any sort of cryptographic currency
it’s just a plain old number sitting in
a database just like your actual bank
account or PayPal or something like that
in addition it has another table called
spent coins and you’ll see in a moment
what this means let’s say that a user
now wants to withdraw an anonymous coin
from the system and now this is where
the crypto magic is going to come in
so the user wishes to withdraw an
anonymous coin of a standard
denomination let’s say that that’s a $1
denomination and all of these values refer
to dollars so the first thing that the
bank is going to do on receiving this
request is deduct from the user’s balance it’s
gone down from ten to nine in this
example the next thing the user and the
bank are going to do together is
execute a two-party protocol a blind
signature protocol at the end of which
the user having picked a random serial
number of a coin that’s what’s being
depicted here this is a serial number
for an anonymous coin and the user was
completely at liberty to pick that
number she did and then they executed a
protocol at the end of which the user
has received a signature of the serial
number but in such a way that the bank
did not in fact learn the serial number
the bank had no idea what number it was
signing it just knew that it was some
number and it signed it and now this
signed number represents an anonymous
token this is a token that the user can
pass around to another user so let’s say
that she wants to make a payment to
another user what she’ll do is send to
that user
not only the signed token but also the
plaintext value of the token of the
serial number and what the receiving
user will do immediately is the
following she will immediately contact
the bank and try to deposit this
anonymous coin because without actually
trying to deposit it this red user here
cannot be sure that the blue user is not
trying to double spend the blue user
could be sending that same anonymous
coin to a hundred different users how
can they know that they’re not being
tricked into accepting a double spent
coin the way they’re sure is when the
red user receives the coin they have to
immediately contact the bank to verify
if it’s valid or not and only if the
coin turns out to be valid will the red
user proceed to complete the rest of
whatever transaction she was having with
the blue user so the bank now receives
the message to deposit the coin and note
that it now gets finally the plain text
serial number as well as its own
signature
the bank looks at the signature verifies
that it’s a valid signature and here’s
the key thing it also verifies that the
serial number that it received is not
on the list of spent coins that’s how it
knows that this is not a double spend
attempt this is the legitimate first
spend of a coin that the Bank signed
before so it’s a legitimate anonymous
token and since the bank didn’t see the
serial number the first time around the
bank does not know which user initially
withdrew this anonymous coin and
that’s the key anonymity property in the
period of time between the blue user
withdrawing this coin and then perhaps
much later sending it to the red user
who immediately deposits the coin many
other pairs of users might have
deposited and withdrawn coins and the
bank has no way to tell them apart so
coming back to this part of the protocol
the bank verifies that this is a new
serial number that it’s seeing for the
first time it puts that serial number
into its list of spent coins so that it
cannot be spent any more and then adds
one dollar or whatever the denomination
is to Red’s account and then sends back
a message saying this is okay and now
the red user has verified that they
received a legitimate anonymous coin
from the blue user and can now proceed
to complete the transaction so this is
the entirety of a very simple anonymous
electronic cash scheme and the key
property here is that the bank cannot
link the two users so I asked you to
think about whether this has any
drawbacks other than anonymity and of
course the glaring thing that you
probably noticed is that all of this
depends upon trusting this Bank I mean
look at this part of the system this is
simply the bank keeping numbers in its
database of who owns how much money
right so this seems to be a
trust model that’s very very different
from the model that Bitcoin operates
under so a lot of the traditional
cryptography research on anonymous ecash
was in this model where you were willing
to trust a bank for many things
including keeping your money
but you were not willing to trust a bank
with your anonymity you wanted to be
sure that the bank didn’t know who was
interacting with whom okay it’s
an interesting model it’s a valid model
and many such schemes were developed
under this model but in retrospect it
seems to have been that the
decentralization problem was a much more
important one to solve than the
anonymity problem in order for anonymous
electronic cash to become successful
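The withdraw and deposit steps of the bank-based scheme walked through above can be sketched as follows. This is a toy model under stated assumptions: the class and function names are invented, the blind signature is done with textbook RSA on a tiny key (the lecture does not say which construction is used), and with a modulus this small coins could easily be forged. The point is only to show the two tables and the double-spend check.

```python
import secrets
from math import gcd

class Bank:
    """Toy bank holding the two tables from the protocol: balances and spent coins."""

    def __init__(self, balances):
        self.n, self.e, self._d = 3233, 17, 2753  # toy RSA key, illustration only
        self.balances = dict(balances)            # user -> balance in dollars
        self.spent_coins = set()                  # serial numbers already deposited

    def withdraw(self, user, blinded_serial):
        """Deduct one dollar and sign the blinded serial number without seeing it."""
        if self.balances[user] < 1:
            raise ValueError("insufficient balance")
        self.balances[user] -= 1
        return pow(blinded_serial, self._d, self.n)

    def deposit(self, user, serial, signature):
        """Accept a coin only if the signature is valid and the serial is unspent."""
        if pow(signature, self.e, self.n) != serial % self.n:
            return False                          # not signed by this bank
        if serial in self.spent_coins:
            return False                          # double-spend attempt
        self.spent_coins.add(serial)
        self.balances[user] += 1
        return True

def withdraw_coin(bank, user):
    """User side: pick a random serial, blind it, get it signed, then unblind."""
    serial = secrets.randbelow(bank.n - 2) + 2
    while True:
        r = secrets.randbelow(bank.n - 2) + 2
        if gcd(r, bank.n) == 1:
            break
    blinded = (serial * pow(r, bank.e, bank.n)) % bank.n
    blinded_signature = bank.withdraw(user, blinded)
    signature = (blinded_signature * pow(r, -1, bank.n)) % bank.n
    return serial, signature

bank = Bank({"blue": 10, "red": 0})
serial, signature = withdraw_coin(bank, "blue")     # blue withdraws an anonymous coin
first = bank.deposit("red", serial, signature)      # red deposits it right away
second = bank.deposit("red", serial, signature)     # a second deposit is rejected
print(first, second, bank.balances)
```

Note that `withdraw` never receives the serial number and `deposit` never receives the withdrawing user, which is where the unlinkability comes from, while the balances table makes plain how much trust is still placed in the bank.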
people were willing to accept a
decentralized e-cash system with only
sort of pseudonymity properties and not
real anonymity and then get to work on
maybe improving the anonymity instead of
starting from a fully provably anonymous
electronic cash system that relied on a
single central authority but more
generally anonymization and
decentralization as we’ll see repeatedly
in this lecture are in conflict with
each other there are at least a couple
of reasons for this one is that as we
saw in the last slide often for
anonymity you might want to rely on
certain interactive protocols with a
bank in order to do some blinding which
we saw in blind signatures that’s where
you get anonymity from but how are
you going to do that without a central
bank to carry out that protocol with
it’s not clear but even if you got rid
of this blinding and were willing to
accept just pseudonymity instead of true
anonymity you still have the problem
that in order to decentralize and still
get security properties like resistance
to double spending often the way to go
is to record and trace everything in a
public ledger as Bitcoin does and so you
might even further compromise your
anonymity and privacy properties so
these are two big challenges to overcome
and as we’ll see much later in this
lecture Zerocoin and Zerocash are
cryptographic anonymous decentralized
electronic cash schemes that have some
similarities to the blind signature
based protocol that I showed you earlier
but some of the giant challenges that
they have to tackle involve these two
limitations
all right I said several times earlier
that bitcoin is only pseudonymous and so
all of your transactions or addresses
could get linked together let’s now go
in and see how that might actually
happen let’s in fact start from
WikiLeaks again I showed you a quote
from them saying bitcoin is a secure and
anonymous digital currency and this is
actually the page that that was taken
from this is their donations page and
here you’ll see that in addition to this
blurb about Bitcoin being secure and
anonymous they have a donation address
over here
this is of course the hash of a public
key you’ve seen things like this in
previous lectures but they also have
this interesting refresh button right
next to that what do you imagine this
refresh button might do well as you
might expect if you click on that
refresh button it’ll give you an
entirely new donation address let’s go
in and take a look at that so a totally
new address popped up on the page so
what is going on here what WikiLeaks is
doing is it’s making sure that each time
a person visits the page each time a
person wants to visit the page and make
a donation they send that donation to a
totally new public key that WikiLeaks
creates just for that purpose
so here WikiLeaks is taking advantage of
the ability to create new pseudonyms new
public keys to the maximum every single
transaction that they receive they want
to receive it at a new address and in
fact this is the Bitcoin best practice
for anonymity to always receive new
transactions at a fresh address so you
might look at this and think surely then
these different addresses must be
unlinkable you receive a transaction over
here and then much later you spend it by
sending it to somebody else you receive
another transaction at this address and
then you send it to someone else over
there so how might somebody link them well
here’s the key let’s imagine this
scenario
Alice a customer goes to a big-box store
and wants to buy a teapot so in the
scenario Alice has a few bitcoins lying
around with these different
denominations
and the store lists the teapot for a
price of eight bitcoins it’s a pretty
expensive teapot at today’s exchange
rates so imagine that’s centi-bitcoins
or something if you like at any rate Alice
has these different addresses and wants
to pay for the teapot how is she going
to accomplish this she doesn’t actually
have an address with eight bitcoins
sitting in there and so what she’s going
to do is she’s going to combine several
different input transactions into a
single transaction in order to pay eight
bitcoins to the store
so this reveals something for somebody
who’s looking at this transaction that
gets recorded permanently in the
blockchain they’re going to think AHA
two different inputs to this transaction
that could only happen because both of
these input addresses are under the
control of the same user they were able
to use their wallet software to create a
transaction that combined both of them
into one so in other words shared
spending is evidence of joint control of
two different addresses and it doesn’t
stop there this is not just about
linking two different addresses that are
inputs to a transaction you can do that
transitively if Alice has a
whole cluster of addresses that have
been linked and then she creates a new
transaction that combines one of those
addresses with a new address you can add
this new address to the cluster so this
is the first insight behind being able
to link transactions together and we’ll
see later on that an anonymity technique
called CoinJoin works by violating
exactly this assumption but if you
assume that people are just using
regular Bitcoin wallet software not
doing anything special on top of it then
this technique tends to be pretty robust
and this has been explored in a variety
of research papers and as an aside just a note
about this lecture a lot of what we’re
going to be discussing today gets into
the frontiers of where the research
knowledge is so the state
of the art may have advanced in a few
months or a few years so every time I
talk about a technique that we know from
a particular research paper I’ll give
you a reference to that paper so that
you can look it up you can look up
papers that cite it and
you can build up that knowledge on your
own now in particular one of the papers
that uses this technique used it for a
particular purpose there was a well
publicized Bitcoin theft a few years ago
and the authors
of this paper decided to see how the
thief had been moving bitcoins around
between multiple addresses and
so this is the paper in question it’s
called An Analysis of Anonymity in the
Bitcoin System and so this is one of the
first major research efforts that did
what we call transaction graph analysis
so you can use the techniques that I
showed you in previous slides and you
can draw a lot of these pretty graphs
and deduce that this represents the
thief moving money around between his
own different addresses this is the
thief sending money to someone else and
various things like that
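The shared-spending heuristic described above, where addresses that appear together as inputs to one transaction are assumed to be jointly controlled and the links are followed transitively, can be sketched with a union-find structure. The transaction data and address names below are hypothetical, and the code assumes ordinary wallet behavior; it would give wrong answers for transactions built to break this assumption.

```python
def cluster_addresses(transactions):
    """Group addresses that are linked, directly or transitively, by shared spending.

    `transactions` is a list of input-address lists, one list per transaction.
    """
    parent = {}

    def find(addr):
        """Return the representative of addr's cluster, compressing the path as we go."""
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    def union(a, b):
        """Merge the clusters containing a and b."""
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        for addr in inputs:
            find(addr)                  # register every input address
        for other in inputs[1:]:
            union(inputs[0], other)     # all inputs of one transaction share an owner

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical input addresses of four transactions.
transactions = [
    ["A1", "A2"],   # A1 and A2 spent together
    ["A2", "A3"],   # A3 joins the same cluster through A2
    ["B1"],         # single-input transaction, no new link
    ["C1", "C2"],
]
clusters = cluster_addresses(transactions)
print(clusters)
```

Nothing in this clustering says who owns a cluster; it only groups addresses, which matches the point that linking clusters to real-world identities is a separate step.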
I haven’t yet shown you anything that
allows you to link any of these clusters
to a real-world identity but let’s defer
that question for a bit
and go back to
the scenario of Alice and the teapot so
let’s look at it again maybe the teapot
has gone up in price to 8.5 centi-bitcoins
so what is Alice going to do now
she can’t combine any subset of her
transactions or her addresses to produce
the exact amount of change necessary for
purchasing this teapot so instead what
she’s going to do is exploit the fact
that transactions can have any number of
inputs and outputs and create a single
transaction that looks like this it
combines these two inputs to produce
this output that goes over here and
another output that goes to an address
that she herself owns and this is called
a change address which you saw in a
previous lecture this presents a
conundrum for an adversary who’s looking
at this the adversary might be able to
deduce that these two addresses belong
to the same user he might suspect that
one of these addresses also belongs to
that same user but has no way of knowing
which one that is in this particular
example the change address is a small
amount but it doesn’t have to be that
way at all Alice might own an address
that has 10,000 bitcoins and might spend
a little bit on the teapot and might
send most of the rest of it back to
her own change address and these
transaction outputs don’t have any
particular ordering in the blockchain
their order is not meaningful at all so it’s
not clear what the adversary might do
it’s not clear how the adversary might
determine which address is change in a
multi output transaction so what is the
adversary to do there is another pretty
cool technique for this again from a
research paper which I’ll tell you about
but the technique is this the authors
call this idioms of use and they exploit
idiosyncratic features of different
wallet software for example one thing
they found is that most wallet software
uses an address as a change address only
once and this in fact seems
to follow Bitcoin best practice
for anonymity in a sense if you have a
new transaction where you need to create
a new change address don’t use an
address that you’ve already used before
as a change address create a new address
and use it for this purpose right now
not all addresses that are outputs of
transactions might have this property
going back to the example of the big-box
store the store might advertise a long
term address at which it wants to
receive bitcoins instead of receiving
bitcoins at a different address every
time so not every non-change address has
the property that it’s used only once
but every change address does have that
property
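The two heuristics discussed here (shared inputs belong to one user, and a one-time output address is probably change) can be sketched in a few lines. This is a simplified illustration, not the paper's actual implementation: the transaction format (lists of address strings in blockchain order), the function name `cluster_addresses`, and the rule "exactly one output address that has never appeared before" are all assumptions made for the example. The published heuristic has additional conditions to cut down false positives.

```python
from collections import defaultdict


class UnionFind:
    """Minimal union-find used to merge addresses into clusters."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_addresses(transactions):
    """Cluster addresses from (inputs, outputs) pairs given in chain order.

    Assumed format: each transaction is a pair of lists of address strings.
    """
    uf = UnionFind()
    seen = set()  # addresses that appeared in any earlier transaction
    for inputs, outputs in transactions:
        for addr in inputs + outputs:
            uf.find(addr)  # register every address, even unlinked ones
        # Heuristic 1: all inputs of one transaction share an owner.
        for addr in inputs:
            uf.union(inputs[0], addr)
        # Heuristic 2 (simplified): if exactly one output address is
        # brand new, treat it as the sender's change address.
        fresh = [a for a in outputs if a not in seen]
        if inputs and len(outputs) > 1 and len(fresh) == 1:
            uf.union(inputs[0], fresh[0])
        seen.update(inputs)
        seen.update(outputs)
    clusters = defaultdict(set)
    for addr in list(uf.parent):
        clusters[uf.find(addr)].add(addr)
    return list(clusters.values())


# Alice funds A1 and A2, the store already uses the long-term address S,
# then Alice pays S from A1 + A2 and sends change to the new address C.
txs = [([], ["A1"]), ([], ["A2"]), ([], ["S"]), (["A1", "A2"], ["S", "C"])]
clusters = cluster_addresses(txs)
```

In this toy history the store's address S has been seen before while C has not, so C is grouped with A1 and A2, and S stays in its own cluster.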
so they use this and they found that it
works pretty well on the other hand this
has some limitations it just
happens to be a feature of wallet
software and so there are a lot of false
positives that might creep in to these
clustering techniques if you use
techniques like this so it required a
lot of manual intervention nevertheless
they were able to use the technique that
I showed you before which is clustering
shared inputs together as well as a few
heuristics for change address
detection and then what they were able
to do is they were able to look at the
entire Bitcoin transaction graph and
create some giant clusters that they
hypothesized belonged to various major
service providers and here’s what that
graph looks like after applying these
two heuristics and this is the paper in
question this is by Sarah Meiklejohn
and others there’s a whole bunch of authors
of this paper now this graph looks very
interesting here the sizes of these
circles represent the amount of money
flowing into those clusters and the
number of edges going out of a cluster
represent the number of transactions
let’s try to just stare at this for a
second and see if we can guess what some
of these major service providers and
other clusters of nodes might be this
huge one here that dominates in
transaction volume compared to any other
cluster given that this paper was
written in 2013 we might guess that it’s
Mt. Gox which was a very prominent
exchange at the time that
later went under we might also guess
that this little one here that only has
a little bit of transaction volume in
spite of having a very large number of
transactions sort of corresponds to the
profile of the gambling service Satoshi
Dice because the way that it works is
you send a tiny amount of bitcoins and
you either win that bet or you lose that
bet and so you might get double the
bitcoins or none of the bitcoins so
that’s the gambling service Satoshi Dice
we might guess that it’s this one here
we might guess that it’s Mt. Gox and
so on
but this kind of guessing is suboptimal
the authors wanted some sort of reliable
way of identifying what are the service
providers corresponding to each of these
clusters how did they do that
well one idea you might have is you
might think oh why not just go to the
Mt. Gox website and see what address
they advertise for receiving bitcoins
well that doesn’t quite work because
they’re going to advertise a new address
for every single transaction and if you
just go to the website look at the
address and actually don’t complete that
transaction you don’t send bitcoins
there then they’re simply going to
discard that address they’re not going
to reuse that address for another
customer in other words that address
will never get used you simply won’t
find it in the blockchain so what’s the
way around this well the only way to
reliably infer addresses that are
associated with a service provider is to
actually transact with that service
provider which is exactly what the
authors did they went ahead and
bought a variety of things and
interacted in a
variety of other ways with a bunch of
service providers comprising 344
transactions in all mining pools wallet
services exchanges various merchants
even gambling sites and so on and they
got a bunch of cool things to show for
their efforts and Meiklejohn informs me
that in fact the cupcakes were really
good at any rate the authors used
this very clever technique to go ahead
and label the major clusters in
the graph that I showed you on the
previous slide and so this is what the
labeled graph looks like now in fact
this was Mt. Gox as we might have
guessed this was Satoshi dice but a lot
of the others would have been very
difficult to guess and by actually
transacting with these services they
were able to identify most of these
service providers so already now we’ve
seen something pretty interesting beyond
just clustering being able to put
labels on the clusters so the next
question is sure you can do these labels
for these major service providers can
you put labels for individuals in other
words connect little clusters
corresponding to individuals to their
real-life identities well there are at
least a couple of different ways in
which that can happen one is intuitively
what I told you right at the beginning
you could simply interact at a coffee
shop or with some other merchants so
they’ll learn some transaction or some
address that corresponds to you and they
might use that to tag your cluster there
are at least a couple of other ways in
which this might happen and one is that
there’s high centralization in these
service providers so the intuition here
is that most users in the course of
normal usage of Bitcoin over a period of
months or years are going to interact
with at least one of those major service
providers that were labeled in the
previous graph so if somebody wants to
identify a cluster corresponding to a
particular user there’s a very high
chance that they’re going to be able to
identify a transaction that ties that
cluster with a known labeled cluster and
then they can go to that service
provider and if they have the
appropriate authority subpoena that
service provider or if they’re a hacker
try to hack into that service provider
and so on and so this is one major
avenue in which regular users can get
de-anonymized
because they eventually inevitably
interact with one of these major easily
identified service providers another one
is simply carelessness a lot of users
end up posting address information in
forums they might post one of the
Bitcoin addresses that they own for
example to receive donations when
they’re posting comments on forums now
that might be because these users are
not worried about getting de-anonymized
it could also be because they don’t
realize that posting one of their
addresses is almost going to inevitably
allow somebody to connect all of their
different addresses together okay so
hopefully I’ve convinced you that
there are clever ways that an attacker
might utilize in order to not only link
different addresses or transactions
belonging to a user but go from there
to real world identity and our
experience our history of these
de-anonymization algorithms shows that
they only get more powerful with time as
more auxiliary information as we call it
becomes available for attackers to use
to link things together and get to users’
identities so this is something to worry
about if you care about privacy before
we look at how to make things better for
anonymity let’s look at a completely
different way in which users can get
de-anonymized so far what we’ve looked at
is all based on what is available to the
attacker in the blockchain right the
part that is permanently and publicly
recorded but recall that that’s not the
only part of Bitcoin there’s also a
peer-to-peer network in which a lot of
messages are sent around that don’t
necessarily get permanently recorded in
the blockchain
so the blockchain in networking
terminology is called the application
layer and the peer-to-peer network is of
course the networking layer and so
de-anonymization can happen at this totally
different layer at the networking layer
well how could that happen here’s an
example this was first pointed out by
Dan Kaminsky a few years ago in a talk
at Black Hat here’s the peer-to-peer
network what he noticed is that when a
node creates a transaction and wants to
broadcast it it’s going to connect to a
lot of nodes at once and
broadcast the transaction and so if a few
nodes on the network put their heads
together they can figure out that hey
this new transaction this is the first
we heard of it and all of us first heard
of it from this particular node so this
must be the node this must be the IP
address corresponding to the user who
created this transaction so here you
have a linkage not between a transaction
or a cluster and a real world identity
instead you have a linkage between a
transaction and an IP address and of course
an IP address is something that’s very
close to real-world identity there are a
lot of ways to go from there to the next
level of finding identity so this is
already a serious problem luckily though
this is not a very hard problem to solve
why because this is now a problem of
communications anonymity and
communicating anonymously is a problem
that has received a lot of attention
from the research community and as we
already saw in the introduction there is
a good system called Tor that you can
use for communicating anonymously now
there is one little caveat Tor is
intended for what is called low latency
activities such as web browsing where
there is a large volume of flow and you
don’t want to sit around waiting for too
long and you want to get the response
immediately so it makes some compromises
in anonymity in order to achieve low
latency Bitcoin is inherently a high
latency system right because it takes a
while for transactions to propagate
through the network and especially to
get confirmed in the blockchain so we
don’t have this low latency constraint
so it’s possible that we could come up
with a more specific fine-tuned sort of
anonymity network for this particular
purpose and there are such things
called mix nets the only problem is that
Tor is the system that’s most widely
deployed and analyzed and robust and
functional today but it’s possible that
somebody might develop a mix net
solution for anonymizing your Bitcoin
communications and if that happens that
would be something to switch to so let’s
summarize what we’ve learned so far
we’ve seen that based on the information
in the blockchain your
addresses could get linked together and could
also get linked to identity we’ve also
seen that based on the information in
the network layer a transaction or
address could get linked to your IP
address luckily this latter problem is
simple to solve if you care about your
anonymity and privacy when using Bitcoin
it’s a good idea to do it through Tor
but the former problem is much trickier
and that’s what we’re going to spend the
rest of this lecture talking about
so there are a variety of solutions to
inhibit what we’ve been calling
transaction graph analysis and
the first of them is called mixing so
what is mixing well the intuition behind
this is very very simple it’s the same
intuition that comes up in a lot of
contexts which is that if you want
anonymity use an intermediary to route
your communications or your funds or
whatnot so let’s look at what that might
look like visually here’s an
intermediary and in a second we’ll get
to who these intermediaries might be but
assume that there is some intermediary
some service that allows users to put in
bitcoins but the key property that it
gives you is that after these bitcoins
have been put in it forgets who put them
in and treats its entire store of
bitcoins as indistinguishable from each
other and in fact it might further
combine them all into one giant
transaction or it might further mix them
or split them and merge them in
different ways whatever but the key
property is that when users later come
in to withdraw their bitcoins it’s not
tied to the coin that they put in
they’re going to get some other say
randomly picked deposit that
the intermediary received so when these
three users come back they’re going to
withdraw these coins in a random order
and so somebody looking at this in the
blockchain who doesn’t have the records
that the intermediary might or might not
store just from the publicly available
information on the blockchain is not
going to be able to link the ultimate
input addresses to the ultimate output
addresses corresponding to the same user
so that’s the intuition behind
intermediaries now looking at this does
that strike a chord have we seen in
previous lectures something that offers
services that are similar to this that
allows you to deposit bitcoins and then
withdraw them later at a later time you
might recall that this is exactly what
online wallets do there are services
where you can just store your bitcoins
online until you need them and so you
might wonder well is that the solution
to our problems do online wallets
provide anonymity let’s think about that
this is not obvious but I will start by
mentioning that it’s taken well-known
researchers by surprise here
was a post on the New York Times Bits
blog reporting on a preprint of a paper
released by two Israeli researchers
saying that there was a link between
Dread Pirate Roberts the pseudonymous
creator of Silk Road which we’re gonna
see more about and Satoshi Nakamoto this
was of course very surprising but as it
turned out all that had happened was
that they had misinterpreted a link that
went through an intermediary and that
intermediary just turned out to be Mt.
Gox which you can think of sort of as an
online wallet service and so a few days
later this other post was published at
the same venue see if you can spot the
difference they had to retract their
study and I think they had made a very
simple mistake of not accounting for the
presence of this intermediary so it’s
clear that at least in some sense online
wallets provide some sort of anonymity
because at least somebody tried to make
a connection between an input and an
output address and completely failed at
that so let’s try to understand exactly
the sense in which online wallets
provide anonymity and I think a good way
to do that would be to in fact contrast
online wallets with the online services
that exists specifically for the purpose
of acting as these intermediaries for
anonymity and those are going to be
dedicated mixing services we’ll talk
about mixing services in much more
detail but very briefly there are two
things that they promise that you won’t
get simply by putting your bitcoins into
an online wallet and retrieving them
again one is that they promise not to keep
records it’s not just that as a side
effect they sort of randomly give you
bitcoins that came from some other
address but they specifically say that
they won’t keep records and so even if
they tried to they wouldn’t know
which bitcoins were the ones you put in
and so with high probability you’re
going to get some other bitcoins back
and furthermore even if someone came
knocking for their records or if they
got hacked and so on there would be
nothing to find there would be no
records so that’s something that a
mixing service promises and the other
thing is that you don’t need your
real-life identity in order to interact
with these services and this is in
contrast to most of these online wallets
why because online wallets are typically
reputable and in fact often regulated
businesses and this fact has two
consequences one is that they’ll
typically require your identity in
banking there is the know your customer
principle which essentially at a
technical level translates to learn the
customer’s identity and store those
records and in fact they will keep
records if they receive a deposit they
will keep the link between the identity
and the Bitcoin address if they move
money around internally they will
probably keep records of all of that and
just because when you withdraw your
bitcoins they come from a different
address does not mean that
the online wallet does not know the link
that link probably does exist in their
records and will exist for all eternity
even if they don’t explicitly ask for
your identity think about this to even
interact with an online wallet
you do need a persistent long-term
identity you can’t possibly use a
different pseudonym every time because
if you did they’d have no way of
associating an account with you or
knowing how many bitcoins they owed you
right so because of that even if they
didn’t ask for your identity at the very
least the online wallet knows the
address of every single deposit that you
made of the bitcoins that you put into
the system and more importantly every
single withdrawal that you made and so
when you make a series of withdrawals
from an online wallet and proceed to
spend those bitcoins the wallet service
can now connect all of those together in
a profile and of course it’s not just
the wallet service people who care about
anonymity are also worried about those
records getting hacked insider attacks
somebody who has a subpoena for getting
those records and so on and so forth so
with respect to the wallet service
itself and whoever they might be
cooperating with you have no anonymity
in this context on the other hand there
is something cool about this if you are
willing to trust them with your
bitcoins then what’s going to happen is
you’re going to keep them in the wallet
service for much longer than you
typically would with a mix service
why because you don’t trust a mix
service as much you want to put in your
bitcoins and you want to receive it back
immediately from some other address
at an address of your choosing
right so unlike that for an online
wallet service you’re going to have a
bigger anonymity set why because your
anonymity set from the point of view of
someone with no privileged information
from the point of view of someone who is
merely looking at the blockchain your
withdrawal could look indistinguishable
from every single withdrawal ever made
from that service provider so with
respect to the wallet service you have
no anonymity with respect to everybody
else you have a bigger anonymity set
than you possibly would with using
a mixing service or at least with using
a single mixing service so if we look at
this this looks suspiciously similar to
the kind of privacy properties that you
have with the traditional banking system
there are the centralized intermediaries
that know a lot about our transactions
but from the point of view of a stranger
with no privileged information we have a
pretty good amount of privacy so even if
this gives you some sort of anonymity
it’s almost at best what you get with
the traditional system and so those are
not the kind of people who are typically
looking for anonymity in Bitcoin anyway
if they were happy with the anonymity
properties of the traditional system
they would have probably stayed with
that system and so generally people who
are looking for anonymity properties in
Bitcoin simply did not want to accept
the trust requirements that these online
services online wallet services require
and they don’t want the sort of
anonymity properties that it gives you
they don’t want to have to trust that
service with with their anonymity and in
fact we’ve seen that there have been a
lot of closures of these exchanges and
services and so there’s good reason for
believing that if you put all your trust
in an online service you might simply
lose your money okay so having rejected
online wallets as an anonymity solution
let’s turn to these dedicated mixing
services that I told you a little bit
about before looking at their details
let’s talk about the terminology a
little bit I like to call it a mix some
people call it a mixer
these are really the same thing some
people also call them laundries I don’t
like this term at all and the reason for
this is that it needlessly attaches
moral meaning to something that’s a
purely technical term as we’ve seen
earlier there are very good reasons why
you might want to protect your privacy
in Bitcoin and use mixes for entirely
good reasons for everyday privacy of
course we must also acknowledge the bad
uses but it seems a little bit weird to
me to use the term laundry that implies
that your coins are dirty and you need
to clean them attaching a negative
moral value to the whole thing
and for that reason I’m not going to use
that term in this lecture we’ll go with
the technically neutral term which is
mixing so in talking about mixing there
are several of us about six of us who
got together researchers at Princeton
Concordia and Maryland including all
four of us who are doing this online
lecture series and analyzed the existing
mix ecosystem and proposed a series of
changes for improving the way that mixes
operate both in terms of anonymity and
the trustworthiness of mixes so let’s
look at those principles before I show
you those principles as a quick reminder
at a very fundamental level how does a
mix operate it asks for an address at
which you want to receive bitcoins and
it gives you an address to send bitcoins
to the mix and then you both execute
that transaction it’s a swap basically
in a second I’ll show you what that
looks like visually but what were our
principles for running these mixes
properly well the very first one is that
you might want to use a series of mixes
instead of just a single mix and this is
a very well-known principle using a
series of routers is the same principle
behind the anonymous communication system
Tor and it’s a good idea because it
allows you to not have to trust a single
mix but instead be sure that as long as
any one of these mixes keeps its promise to
delete its records then you have a good
guarantee of anonymity and in particular
mixes should implement a standard API so
that this can be very easy for clients
to accomplish and right now this is not
quite the case and this is our paper
for your reference so now let’s go in
and look at what a series of mixes would
look like visually so here it is here is
a user who starts with a coin or an
input address that we assume that the
adversary has managed to link to this
particular user they’re going to send it
to the mix at this address and get back
a Bitcoin at this other output address
that they provide they freshly generate
this output address and provide that
address to the mix the mix will
hopefully return the same amount of
bitcoins at this output address there’s
no way for the user to force the mix to
do that the user has to trust the mix
and this is as we’ll see a recurring
problem with the whole notion of mixes
and either immediately or after a time
gap it doesn’t matter the user will take
the Bitcoin or bitcoins of whatever
value they’ve received at this address
and send it to a different mix which is
hopefully not cooperating with the first
mix and repeat this process over and
over again
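The claim that a chain is safe as long as any one mix deletes its records can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Mix` class, the `keeps_records` flag, and the helper names are hypothetical, addresses are random hex strings rather than real Bitcoin addresses, and the adversary modeled here is one who obtains the stored records of every mix, not one who analyzes the blockchain.

```python
import secrets


def fresh_address():
    """Stand-in for generating a new Bitcoin address (hypothetical helper)."""
    return secrets.token_hex(10)


class Mix:
    def __init__(self, name, keeps_records):
        self.name = name
        self.keeps_records = keeps_records
        self.records = {}  # input address -> output address

    def swap(self, in_addr, out_addr):
        # A mix that keeps its promise stores nothing about the swap.
        if self.keeps_records:
            self.records[in_addr] = out_addr


def send_through(mixes, start_addr):
    """Route one chunk through each mix in turn, using a fresh output address."""
    addr = start_addr
    for mix in mixes:
        out_addr = fresh_address()
        mix.swap(addr, out_addr)
        addr = out_addr
    return addr


def trace(mixes, start_addr):
    """Follow stored records hop by hop; None means the trail went cold."""
    addr = start_addr
    for mix in mixes:
        if addr not in mix.records:
            return None
        addr = mix.records[addr]
    return addr


chain = [Mix("first", True), Mix("second", False), Mix("third", True)]
final_addr = send_through(chain, "alice-input")
traced = trace(chain, "alice-input")  # None: the second mix kept no records
```

With every mix keeping records the trace would reach `final_addr`, which is why the user hopes that at least one mix in the chain is not cooperating with the others.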
so from an adversary’s point of view
looking at the public blockchain they’re
merely going to see along with all of
these transactions a variety of other
mix transactions that other users are
executing and well hopefully the
adversary will have no way to tell apart
which of those transactions correspond
to this particular user and which one
corresponds to some other users so
that’s the first principle and the
second one if you think about what I’ve
just said in order to make that possible
you want to make these transactions as
uniform as possible
so that linkability is minimized
and what does it mean to make these
transactions as uniform as possible
one important consequence is that all of
these mix transactions not only from a
particular mix but all of the mixes in
this mix ecosystem should have the
same value so we think that all mixes
out there providing service should agree
upon a chunk size a standard chunk size
and of course there can be multiple
denominations but there can’t be too
many and you can’t simply allow the
users to put in whatever amount of
bitcoins they wish to that wouldn’t work
so you need this kind of standardization
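As a small illustration of standard chunk sizes, the sketch below splits an arbitrary amount into chunks drawn from a short list of denominations. The denominations, the use of satoshis as the unit, and the function name `split_into_chunks` are assumptions for the example; the lecture does not specify actual values.

```python
def split_into_chunks(amount, denominations):
    """Greedily split an integer amount into standard-size chunks.

    Returns the list of chunks and whatever remainder is too small
    to fit the smallest denomination.
    """
    chunks = []
    remaining = amount
    for size in sorted(denominations, reverse=True):
        count, remaining = divmod(remaining, size)
        chunks.extend([size] * count)
    return chunks, remaining


# Hypothetical denominations: 0.01 BTC and 0.001 BTC, expressed in satoshis.
chunks, leftover = split_into_chunks(2_350_000, [1_000_000, 100_000])
```

Every chunk then looks like every other chunk of the same denomination on the blockchain, and the leftover would have to be kept out of the mixing process or handled separately.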
in addition to this we found that there
are a variety of possible attacks in
which an
adversary might infer various things not
just the amount even if you remove the
amount some other properties including
timing for example in order to try to
link users’ input addresses and output
addresses together this type of linking
can be avoided but human users if they
interact with the mix are not going to
be able to take into account all of
those possible linking attacks so
instead what needs to be done is this
client-side software must be automated
and built-in to desktop wallet software
so that this desktop wallet software
automatically knows how to interact with
these mixes in order to preserve the
users anonymity so that was our third
principle our fourth principle is a
subtle one now these mixes why do they
provide this service typically
it’s because they’re a business and if
they’re a business they want to be paid
how are they going to get paid well it
turns out that pretty much the only way
for these mixes to get paid is to take a
cut of the transaction that the user is
sending to the mix that seems a bit
weird because if a mix takes a standard
percentage then an adversary might be
able to use that to link the input
transaction and the output transaction so
some current mixes try to randomize the
transaction fee they might say we take a
random cut between 1% and 3% we found
that this is not a good idea either
because if you put that through a chain
of mixes then the amount of the value in
the chunk is going to dwindle in a
predictable way and this is an important
side channel for the adversary so what
is a way to avoid this we proposed that
these mix fees should be all or
nothing in other words the mix should
either swallow the whole chunk with a
small probability or should return the
whole chunk so if the mix wants to
charge a 0.1% mixing fee this is by the
way very different from the transaction
fee that mining nodes charge this is a
mixing fee on top of that so if the mix
wants to charge a 0.1% mixing fee then
one out of a thousand times the mix
should swallow the entire chunk and 999
times out of 1,000 the mix
should return the entire chunk without
taking any mixing fee this is a tricky
property to accomplish which means that
the mix should generate a random number
in a way that can convince the user that
the mix has not cheated in generating
this random number and has genuinely
flipped a coin which has you know a 99.9
percent chance of coming up one way
versus the other but we do show how
to do this using cryptography in a
way that both parties can be satisfied
that it has worked correctly we think that
really all four of these principles are
necessary to have anything approaching
mathematical confidence in having a
large anonymity set and in our ability
to resist clever inferential attacks by
an adversary that looks at the
blockchain to try to link input to
output the sad news is that virtually
none of the current mixes follow these
principles they’re in a very different
model where each mix operates completely
independently and they have a web
interface and the user interacts with
them totally manually instead of
automatically through their wallet
software and will manually put in the
amount instead of a standard chunk size
it’s whatever amount the user chooses
typically and the mix will take some cut
of that as a mixing fee and send the
rest to the user so this is we don’t
think this is a situation that gives mix
users a lot of anonymity but we think
that by moving to a slightly different
model based on these four principles the
anonymity properties of the mix
ecosystem can be dramatically improved
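The fee principle can be made concrete with a short sketch comparing the two fee models. The function names and numbers are illustrative assumptions. In particular, `all_or_nothing_payout` uses an ordinary local random number generator, which is exactly what a real mix could not get away with: as noted above, the random choice would need to be generated in a way the user can verify, and that cryptographic step is not shown here.

```python
import random


def value_after_percentage_fees(chunk, fee_rate, hops):
    """Chunk value after each of `hops` mixes takes a fixed percentage cut."""
    return chunk * (1 - fee_rate) ** hops


def all_or_nothing_payout(chunk, fee_rate, rng):
    """Return the whole chunk, or nothing with probability `fee_rate`.

    Assumption: `rng` is a plain local generator used only for illustration.
    """
    return 0 if rng.random() < fee_rate else chunk


def expected_mix_fee(chunk, fee_rate):
    """Average fee per chunk under the all-or-nothing model."""
    return chunk * fee_rate


# Percentage fees shrink the chunk in a way that reveals the hop count.
shrunk = value_after_percentage_fees(1.0, 0.02, 3)

# All-or-nothing keeps every surviving chunk at exactly the standard size.
rng = random.Random(0)
payouts = {all_or_nothing_payout(1_000_000, 0.001, rng) for _ in range(1000)}
average_fee = expected_mix_fee(1_000_000, 0.001)
```

Under the percentage model an observer can read the position in the chain off the amount, while under the all-or-nothing model the only values ever seen are the full chunk and zero, with the same fee paid on average.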
all right so through these four
principles we’ve seen how the anonymity
properties of mixing can be improved but
there is still one major problem which
is that users still have to trust these
mixes so again we had a few ways that we
talked about in our paper for what to do
about this mixes can do several things
to improve their trustworthiness one is
that simply by staying in business for a
long time and not stealing users money
they can build up a reputation you might
wonder does reputation count for
anything because it’s simply a matter of
he-said she-said in fact a mix operator
can
claim that a competing mix operator
stole all their money even if that did
not in fact happen well generally
reputation systems in the real world
manage to operate even though there can
be conflicting claims that are made in
this context for example users might
learn to only trust the word of
prominent members of the Bitcoin
community who they think have the best
interests of the ecosystem at heart
another way is that in the system that
we proposed the chunk sizes are going to
be so small that in the regular course
of mixing users are going to mix a
pretty huge number of chunks or at least
the system can be configured in that way
so that the chunk sizes are relatively
small so in that context if a mix has
even a 1% probability of stealing a
user’s chunk then after a hundred or so
interactions with small chunk sizes with
a particular mix the user is going to
know the user is going to detect the
theft and so the user will learn to
never use this mix again and so the
system might sort of correct itself by
users testing mixes for themselves for a
trustworthiness an important thing to
keep in mind here is that the chunks
that users are sending to mixes have
typically already been through other
mixes so the mix itself can’t know which
user the chunk is coming from and so the
only thing the mix can do is to
essentially steal randomly from users
the mix can’t steal from a particular
user so from the user point of view on
average they won’t suffer losses that
are more than the average rate at which
the mix steals so they don’t have to
worry that a mix might particularly have
it in for that particular user
and steal all of their money there’s no
way that that can happen so that’s what
I mean when I say users can test this
for themselves and finally we proposed a
cryptographic mechanism where the mix
can issue a sort of promissory
statement to the user that once it
receives a chunk at a particular address
it will send a chunk back at some other
address that the user provides and so if
the mix fails to keep this promise our
idea is that the user can publicize this
warranty and everybody will know that a
particular mix is cheated and so
everybody will stop using this mix in
the mix well
business and in combination all of these
three mechanisms provide incentives for
mixes to act honestly so these were our
calculations anyway in our proposal we
haven’t proved that this will work in
practice that remains to be seen
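To make the 1% example above concrete, here is a minimal sketch of the arithmetic. It assumes each chunk is stolen independently with the same probability, which is a simplification, and the function name is made up for illustration. With a 1% theft rate, a user who sends 100 chunks through the same mix has roughly a 63% chance of seeing at least one theft, and the chance keeps rising with more chunks.

```python
def detection_probability(steal_rate: float, chunks: int) -> float:
    """Chance that at least one of `chunks` chunks is stolen.

    Assumes (for illustration only) that the mix steals each chunk
    independently with probability `steal_rate`.
    """
    return 1.0 - (1.0 - steal_rate) ** chunks


if __name__ == "__main__":
    # Show how quickly a 1% thief is likely to be noticed.
    for n in (10, 100, 500):
        print(n, "chunks:", round(detection_probability(0.01, n), 3))
```

Because the mix cannot target a particular user, the expected loss stays at the mix's average theft rate, so small chunks keep the cost of this kind of testing low.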
all right so on that note let’s
quickly look at how things are in
practice right now it doesn’t seem that
there are any reputable services
providing dedicated mixing that users
have learned to trust or at least enough
to use on a regular basis in fact this
is from the Bitcoin wiki where the
original is not highlighted in red so I
took the liberty of doing that myself
mixing services may themselves be
operating with anonymity and so if your
funds are not delivered you have no
recourse use at your own discretion so
we’re proposing moving to a different
model where mixes stay in business
become reputable entities and so on that
hasn’t quite happened yet
and note that there is sort of a
bootstrapping problem here if mixes were
reputable entities they would have a big
volume of transactions and so by
interacting with them you’d get a pretty
good anonymity set and so users would be
more confident in interacting with
them and mixes would realize that
they’re making more money by staying in
business and taking a small cut than by
trying to steal the small amount of
money that they’re controlling at any
given time and so mixes would be further
incentivized to stay in business so you
can imagine that once a mix ecosystem
gets going it will be self-sustaining
but whether or not that can eventually
happen we can say for sure that it
hasn’t quite happened yet
so the fact that this mix ecosystem
currently doesn’t exist is a big part of
the reason why many people have proposed
decentralized mixing and there are a
variety of reasons for decentralized
mixing some of which we’ve talked about
one is that there is no bootstrapping
problem so the reason there’s no
bootstrapping problem is that in
decentralized mixing you don’t go
through a particular dedicated mix
service instead you find a community of
peers who all want to do mixing and
somehow without any central coordination
or at least a central service that
collects your funds you manage to mix
with each other so that avoids the
bootstrapping problem because as long as
there is enough interest from Bitcoin
users they can meet with each other and
start mixing how to do that we’ll see in
a second also theft is impossible and
this is enforced through technical means
because nobody is explicitly sending
bitcoins to another user again we’ll see
how this is accomplished it could
possibly provide better anonymity and
we’ll look into more details on that as
well and finally I just want to point
out that this is just more
philosophically aligned with Bitcoin if
you can get rid of having to have a
centralized service for some purpose
then there are a lot of users who are
Bitcoin users who find that appealing so
how might this work the main proposal
for decentralized mixing is called
CoinJoin and this is something that
was proposed by Greg Maxwell a core
Bitcoin developer who we’ll meet again in
the next lecture actually so what he
proposed is different users coming
together to create a single Bitcoin
transaction and what are the outputs of
this transaction we’ll see in a second
but somehow create a single Bitcoin
transaction that combines all of their
inputs and presumably of equal value now
let’s think about this for a second what
is necessary in order for these three
users to create a single transaction
well one way of thinking about it we
might imagine that in order to produce a
signature somebody has to collect all
three private keys that’s not actually
how it works though in Bitcoin all the
signatures corresponding to the
different inputs are totally separate so
each input signature is entirely separate
so what it allows the users to easily do
is create different inputs that
correspond to different users and also
different output addresses that
correspond to different users and
randomize the order between them so in
this situation maybe the users
participating in the protocol might
necessarily have to know which input
address corresponds to which output
address although we’ll see in a second
if we can avoid that as well but
certainly someone looking at the
blockchain
looking at only this single transaction
even if they realize that this is a
CoinJoin transaction will not be able to
find the mapping between the inputs and
the outputs it’s that simple that’s the
essence of CoinJoin of course this is
just one round of mixing on top of this
you have to apply the same principles
that we talked about before so the
principles that I discussed they’re not
only for centralized mixes they apply
essentially with very few modifications
even to the CoinJoin scenario so you
want to do a sequence of CoinJoins you
want to make sure that the chunk
sizes are standardized so that you don’t
introduce new side channels etc etc okay
but let’s look into the single
transaction though exactly how would
this work there are a lot of details
that are still not clear so let’s look
at this in algorithmic form so if we
write it out like this what needs to
happen is that a group of peers who all
want to mix somehow need to find each
other that’s the first difficulty and
then they have to exchange their input
and output addresses with each other and
one of these users it doesn’t matter who
will construct this transaction not yet
a signed transaction but just the
transaction that corresponds to these
different inputs going to these
different outputs and then they’ll pass
it around to collect signatures from
each of the peers
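The steps just described can be sketched in a few lines. This is a toy model under stated assumptions, not real Bitcoin code: the `Peer` record, the dictionary used as a transaction, and the string "signatures" are all hypothetical stand-ins, fees are ignored, and every peer contributes the same amount. It also includes the check each peer would naturally make before signing, namely that its own output is present and fully paid.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    input_addr: str
    output_addr: str
    amount: int  # same value for every peer (illustrative assumption)


def build_unsigned_tx(peers, rng):
    """Any peer can do this step: list all inputs and outputs, shuffled."""
    inputs = [(p.input_addr, p.amount) for p in peers]
    outputs = [(p.output_addr, p.amount) for p in peers]
    rng.shuffle(inputs)   # order must not reveal who owns what
    rng.shuffle(outputs)
    return {"inputs": inputs, "outputs": outputs}


def peer_approves(peer, tx):
    """A peer signs only if its own output receives at least its input value."""
    paid = sum(value for addr, value in tx["outputs"] if addr == peer.output_addr)
    return paid >= peer.amount


def coinjoin(peers, rng=None):
    rng = rng or random.Random()
    tx = build_unsigned_tx(peers, rng)
    signatures = {}
    for p in peers:  # pass the transaction around to collect signatures
        if not peer_approves(p, tx):
            return None  # one refusal stops the whole round
        # Placeholder: each input is signed separately by its own owner.
        signatures[p.input_addr] = f"sig-by-{p.name}"
    return {**tx, "signatures": signatures}
```

The point of the design is that no private key ever leaves its owner: each peer signs only its own input, and only after inspecting the full transaction.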
now if the peer who constructed the
transaction were disruptive and for
example left out one of the peers’
outputs then the whole thing will
collapse because when that particular
peer gets the transaction in order to
sign it they will simply refuse to sign
and the process will not be able to go
forward but if everything is okay and
everybody acts honestly then the transaction is
constructed and now any peer again
doesn’t matter who can broadcast the
transaction to the network two of them
could do it independently it doesn’t
matter the transaction will of course be
counted only once so that’s it that’s
the whole protocol the entire security
property comes from each peer checking
that their output address is represented
and that their output of course receives
at least as much value as went in from
their input so that seems simple enough
but what are the remaining problems here
well there are three problems one is
how does this group of peers find each
other right and the second is that as
I described in the previous slide this
protocol involves each of these peers
finding out the mapping between inputs
and outputs or at least one of those
peers so that seems like a problem in
fact I want to point out that this is a
worse problem for decentralized mixes
than for centralized mixes and why is
that
in the centralized mixing case you could
hope that these different mixes are run
by entirely different entities who are
not colluding with each other and at
least in some cases these will be
reputable real-life entities who you
would imagine have incentives not to
collude with each other
because they have different goals or for
whatever reason again the reasoning is
similar to Tor you have a variety of
different types of people who are
running Tor nodes they don’t at all have
the same incentives so we imagine that
they’re not all going to collude with
each other and also that they’re not all
going to get compromised by the same
attacker a similar principle holds for
centralized mixes and that only works
because you know something about the
identities of these mixes so these mixes
having known identities and being
reputable entities helps anonymity in
this case we don’t have that luxury with
decentralized mixes because we have no
idea who any of these peers are right so
it could be a single attacker creating
lots of Sybil accounts and accounts in
the sense of just creating lots of
Sybils and trying to get into every
single CoinJoin transaction that’s
ever carried out in order to learn these
input-output mappings and so even if you
do a series of CoinJoins it might
be the case that in each of those
CoinJoins at least one of the participants
was an attacker or was controlled by the
same attacker in which case your entire
anonymity is lost so that seems like a
problem and a third problem and kind
of a tricky one is denial of service
what does this mean well it could happen
that after providing the input-output
pairs one of the nodes disappears and
refuses to sign the resulting
transaction so the transaction is not
able to proceed forward and secondly
even after creating the signature before
the transaction can get broadcast to the
network and confirmed in the blockchain
one of the nodes who might be malicious
might take this input and spend it in
some other transaction that’s unrelated
to this CoinJoin and so this CoinJoin
will look like a double spend attempt
and will be rejected by the Bitcoin
network so that’s another way in which
you can launch denial of service against
CoinJoin so now let’s look at what are
some possible solutions to each of these
three problems well the first one how to
find peers has a very simple solution
it’s not a perfect solution but
people consider this to be somewhat okay
you simply use an untrusted server it’s
sort of like a watering hole where
different users can connect and find
each other but the server is not
necessarily involved in any way that the
users have to trust in running the
protocol all right and as we’re going to
see each of these steps for solving
these problems introduces a little bit
of engineering complexity so this
already requires a whole peer-to-peer
protocol for finding these CoinJoin
peers on top of the Bitcoin protocol and
we’re going to see similar factors that
introduce engineering complexity for
solving each of the other problems so
the next one how do we solve the
anonymity problem well there’s a
simple strawman solution you can frame
the anonymity problem in this way you
need to communicate the set of inputs to
all the peers and also you need to
communicate the set of outputs but break
the linkage between the input and the
output now this becomes a communications
anonymity problem instead of a Bitcoin
anonymity problem right
because it’s simply the matter of
communicating these output addresses
that needs to be unlinked from
communication of the input addresses so
a strawman solution to that since we
already have seen Tor a little bit is
simply this these peers come together
they exchange input addresses and they
disconnect and then reconnect over Tor
after a while and then exchange
the output addresses so this is pretty
simple but it may not be very robust in
practice a better solution might be to
build a special-purpose
anonymous routing mechanism for these
participants to utilize just for this
protocol and there are things called
decryption mix nets that allow you to do
exactly that and such solutions have
been proposed so let’s move to the third
problem which is a denial of service
attack let’s think about it this way
what’s a traditional solution to a
denial of service attack well one
possible solution to a denial of service
is to make it a little bit expensive for
the client to connect to the server and
to receive service well this is not a
client-server model it’s a peer-to-peer
model but we can still try to adopt the
same principles and that’s the principle
behind the first two of the proposed
solutions for denial of service either a
proof of work or a proof of burn so what
do I mean by this proof of work is
simply repurposing the algorithm behind
Bitcoin’s proof of work to require each
of these peer nodes to do a little bit
of computational work before they can
join a CoinJoin protocol and the
rationale is that if the adversary is
going to disrupt every CoinJoin that
exists out there they’re going to be
burning a lot of computing power which
will make it very expensive for them
proof of burn is a similar concept it’s
also called fidelity bonds
in Bitcoin
it allows you to irreversibly destroy
some bitcoins that you own by sending them
to an unspendable address thereby
proving that you’ve made a little bit of
an expensive signal in order to get into
this system so that’s the rationale
behind the first two solutions the
second two solutions that is the third
and fourth also have a similar rationale
which is to identify the malicious
participant one or more malicious
participants who launch the denial of
service to kick them out and to run the
coin join with the remaining
participants and that could be done if
you trust the server a little bit to
carry it out it could also be done in a
purely decentralized manner like this
paper called CoinShuffle proposed
and they came up with a cryptographic
blaming protocol for doing this and it
involves something called zero knowledge
where you learn at least one of the
players who misbehaved without
necessarily learning much more about
what happened and then the rest of the
peers can then redo the protocol at
various points I’ve talked about
side channels so let’s look at an example
of that and I want to point out that
these side channels can be very tricky
not all the mixing in the world can save
you from what I call high-level flows
that could be identifying and here’s a
neat example of this let’s say user
Alice receives a very specific amount of
bitcoins let’s say on a weekly basis as
income and has the habit of always
automatically and immediately
transferring let’s say five percent of
that to her retirement account right so
think about the patterns that will be
visible on the blockchain here no matter
what she does to obscure the link between
the addresses at which she receives her
income and the address to which she
transfers to her retirement account the
patterns here are going to be uniquely
identifying because this is a very
specific value and the 5% of that is
also going to be a specific value and
there’s also a timing pattern every time
money appears here every time money goes
to this address as well so this is a
problem how do we protect ourselves from
this well one suggestion that has been
proposed is not only in the context of
mixing but even in the context of
regular Bitcoin wallets where users are
not even trying to do any mixing is
by Mike Hearn and he calls this merge
avoidance merge avoidance is a very
simple idea when users want to do
payments the proposal is that instead of
creating a giant transaction that
combines as many inputs as necessary in
order to pay the entire amount to a single
address why not have a protocol by which
the receiver can provide multiple output
addresses as many as necessary and the
sender and receiver can agree upon
denominations and the sender can avoid
combining different inputs and can make
a variety of different transactions that
send money from different input
addresses to different output addresses
so this avoids a lot of the problems
both of high-level flows because even
these multiple input and output
addresses cannot be linked to each other
so an adversary might not even be able
to observe the fact that this is a
high-level flow that’s
happening but also avoids problems like
clustering addresses together because of
evidence of shared spending and this is
a proposal that one could think about
incorporating right now into Bitcoin
based payment flows in order to improve
anonymity for everyone
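A small sketch of how merge avoidance could be planned on the sender's side. Everything here is illustrative: the function name, the `(address, value)` coin tuples and the plan format are assumptions, and it leaves out fees and the denomination negotiation mentioned above. The idea shown is only that each coin is spent in its own single-input transaction to a separate receiver address, instead of merging all inputs into one transaction.

```python
def plan_payments(coins, fresh_addresses, amount):
    """Plan one single-input transaction per coin until `amount` is covered.

    coins           -- list of (source_address, value) the sender controls
    fresh_addresses -- distinct addresses supplied by the receiver
    amount          -- total to pay
    """
    plan = []
    remaining = amount
    addresses = iter(fresh_addresses)
    for source, value in coins:
        if remaining == 0:
            break
        pay = min(value, remaining)
        try:
            destination = next(addresses)
        except StopIteration:
            raise ValueError("receiver supplied too few addresses") from None
        # One input, one payment output: no shared-spending evidence.
        plan.append({"from": source, "to": destination,
                     "pay": pay, "change": value - pay})
        remaining -= pay
    if remaining:
        raise ValueError("insufficient funds")
    return plan


if __name__ == "__main__":
    coins = [("a1", 30), ("a2", 50), ("a3", 40)]
    for tx in plan_payments(coins, ["r1", "r2", "r3"], 100):
        print(tx)
```

Note that any change output is itself a potential side channel, which is one reason agreeing on denominations in advance helps.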
now let’s turn to Zerocoin and Zerocash
which are a completely different
approach to Bitcoin anonymity the
approach is sort of to bake it in at the
protocol level and these are
cryptographic heavyweights so
Zerocoin was first developed by
cryptographers at Johns Hopkins and
later on they started collaborating
with other researchers around the world
who had been developing a very
efficient cryptographic technique that
would enable making some of the
cryptographic operations in Zerocoin
more efficient and that resulted in
Zerocash as you’ll see these techniques
provide a qualitatively different level
of anonymity than mixing solutions that
sit on top of Bitcoin but what’s the
catch the problem is that this is not
quite backward compatible with Bitcoin
Zerocoin and Zerocash are going to require
altcoins technically it’s possible that
Zerocoin can be deployed as what is known
as a soft fork of Bitcoin but the
practical difficulties are high enough
that this is not really considered
feasible and in fact the Zerocoin
developers intend to deploy it as an
altcoin themselves instead of trying to
be compatible with Bitcoin directly
let’s start talking about the details
here let’s review some of the things
that I’ve just said so Zerocoin brings
protocol level mixing and being baked
into the protocol what it gives you is a
cryptographic guarantee of mixing what
does that mean you don’t need to trust a
single mix or even a set of mixes or a
set of peers or anybody at all to ensure
your anonymity you just need to rely on
the underlying crypto being solid you
don’t even need to rely on the miners
enforcing this in order to achieve
anonymity it’s purely a cryptographic
guarantee so that’s really great that’s
qualitatively better than what we have
so far and of course it’s not currently
compatible with Bitcoin and here’s the
paper if you want to look it up so how
does Zerocoin work I’m going to introduce
a concept called base coin and I’m
taking a few liberties with the
presentation here in order to simplify
and clarify the concepts I’m going to do
that by mixing some concepts from Zerocoin
and Zerocash but toward the end I’ll make
very clear what the differences are
between the two so like I said Zerocoin is
an altcoin
and I’m going to call that altcoin base
coin I’m not calling it zero coin
because zero coin is something else it’s
an extension of this base coin it’s
something that sort of sits on top of
this altcoin and the key property that
gives you anonymity is that these base
coins can be converted into zero coins
and back again and when you do that it
breaks the link between the original
base coin and the new base coin so think
of this as a cryptographic mixing system
that’s provided by the protocol itself
so how might this work another way of
looking at a zero coin is that it’s a
cryptographic proof that you owned a
base coin not anymore but you owned it
and then you made it unspendable a zero
coin is something that allows you to
assert that to say any miner who might
care and miners can verify these proofs
and that’s what gives you the rights to
later redeem a new base coin in exchange
for the zero coin and the analogy is a
little bit like poker chips so how could
that work and what properties do these
proofs need to have in order to enable
this so one challenge is how to
construct these proofs and the other
trick is how do you make sure that each
proof can be spent only once can be used
only once to redeem a base coin because
if you don’t have that property then
it’s going to lead to double spending so
let’s see how to do that it crucially
involves a concept called zero-knowledge
proofs what are zero-knowledge proofs I’m
going to tell you at a little
bit of an intuitive level so I’m calling
it crypto magic again but what it is is
it’s a way for somebody to prove a
statement without revealing any other
information that leads to that
statement being true a couple of
examples are going to make this really
clear you might be able to prove a
statement like I know an input that
hashes to this particular value and
notice that if the input that you had
picked were long and random and if
you did a proof in such a way that you
don’t actually reveal the input it won’t
necessarily allow somebody else to infer
what that input is a more complex
version of this is you could say I know
an input that hashes
to some hash in the following set of
several different possible outputs and
the zero-knowledge proof that Zerocoin
is going to use is something that’s very
similar to the second category here
let’s dive in a little bit more so zero
coins are minted they come into
existence by minting and anybody can do
this and zero coins come in standard
denominations let’s assume for the rest
of this that zero coins are worth one
base coin each you could also imagine
multiple denominations coexisting how do
you make a zero coin well we’re gonna
see that in the next slide but let me
just say for now that minting a zero
coin doesn’t automatically give it any
value you can’t get free money
it only acquires value once you put it
onto the blockchain and so putting it
onto the blockchain is going to be about
as expensive as the value of that zero
coin that you’re later going to be
able to redeem so you have some sort of
a conservation principle here okay so
here’s how specifically in cryptographic
terms we mint a zero coin
it’s something called a cryptographic
commitment what a cryptographic
commitment is is intuitively you can
think of it as you’re taking a serial
number a random serial number that you
generated and putting it into an
envelope so this intuitive notion of
putting it into an envelope
cryptographically what does that
correspond to what it corresponds to is
generating another random secret R which
you’re never going to make public and
computing the hash of the coin serial
number together with this random secret
now this is a little bit of a
simplification but it but it really
helps you understand the properties of
the system so let’s go with this
description so what just happened here
you generated arbitrarily just like you
generated Bitcoin public keys a serial
number for your zero coin and if it’s
long and random hopefully no one else
has ever picked that same serial number
before and you also generated this other
random number that you’re going to keep
secret and intuitively generating a
commitment to the serial number
corresponds to putting it into an
envelope and sealing it and
mathematically it happens by computing
the hash of the serial number together
with this random value okay
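The envelope analogy can be sketched directly from the simplified description above: pick a random serial number, pick a random secret r, and hash the two together. This follows the lecture's simplification rather than the actual Zerocoin construction, and the choice of SHA-256, 32-byte values and the function names are assumptions made for the example.

```python
import hashlib
import secrets


def mint():
    """Create a coin: returns (serial, r, commitment).

    The serial number is revealed later when spending; r stays secret
    forever; only the commitment is published.
    """
    serial = secrets.token_bytes(32)  # long and random, so collisions are negligible
    r = secrets.token_bytes(32)       # the secret that seals the envelope
    commitment = hashlib.sha256(serial + r).digest()
    return serial, r, commitment


def opens(commitment: bytes, serial: bytes, r: bytes) -> bool:
    """Check that (serial, r) is what was sealed inside `commitment`."""
    return hashlib.sha256(serial + r).digest() == commitment


if __name__ == "__main__":
    serial, r, commitment = mint()
    print("commitment:", commitment.hex())
    print("opens with the right secret:", opens(commitment, serial, r))
```

Because both values have a fixed length, concatenating them before hashing is unambiguous. Someone who later sees only the serial number cannot test it against a commitment without also knowing r.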
once you’ve generated this commitment
what do you do with that well the next
step is to put that commitment on to the
blockchain that’s when the zero coin
sort of becomes real and doing this
requires in essence burning a base coin
and making it unspendable so in
concrete terms how would that work
you’ve got the blockchain over here and
one of those transactions might be a
mint transaction and if you zoomed in it
would be a transaction that’s signed by
Alice who created this zero coin who
minted the zero coin and what we saw
earlier in the structure of transactions
is that over here you would have the
recipient’s public key or the recipient’s
address instead of that here you have this
cryptographic commitment and just like
before just like a transaction having a
pointer to a previous transaction the
same structure is carried over for zero
coin transactions as well so what has
happened here we’ve spent the base coin
in order to mint the zero coin and this
commitment the sealed envelope that
we’ve put into the zero coin is what is
going to allow us to redeem that zero
coin later in exchange for a base coin
once again so how does that work to
spend the zero coin later you will
reveal that serial number that you put
inside the envelope and what miners will
do it’s their job to verify that
the serial number has not been spent
before that the serial number has not
been revealed as a number that was put
inside some other envelope
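The miners' side of this check is simple bookkeeping, sketched below. The class and method names are hypothetical, and the sketch covers only the "has this serial number been revealed before" rule; a real spend would also need the accompanying proof that the serial number belongs to a coin that was actually minted, which is not modeled here.

```python
class SerialLedger:
    """Tracks every serial number that has already been revealed in a spend."""

    def __init__(self):
        self._revealed = set()

    def redeem(self, serial: bytes) -> bool:
        """Accept a spend only the first time a serial number appears."""
        if serial in self._revealed:
            return False  # second use of the same envelope contents
        self._revealed.add(serial)
        return True


if __name__ == "__main__":
    ledger = SerialLedger()
    print(ledger.redeem(b"serial-123"))  # first reveal is accepted
    print(ledger.redeem(b"serial-123"))  # repeat is rejected
```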
that’s what prevents double spending in
the system next you’ll create a zero
knowledge proof that we just talked
about and specifically the zero
knowledge proof will say I know a number
R such that the hash of the serial
number together with R corresponds to
one of the zero coins on the blockchain
and we’ll make that statement more
mathematically precise in a second but
think about what this says it doesn’t
reveal that random number R but somehow
you’re proving that
you are in possession of that number which
combined with the serial number that you
have just made public will result in the
zero coin that was once in the past put
onto the blockchain right so for
somebody looking at this proof this is
all they need to know to verify that you
earlier spent a base coin in order to
get to this point so this now should
give you the right to redeem a base coin
but which base coin and here’s where the
anonymity property comes in you can pick
an arbitrary zero coin in the blockchain
and use that as an input to a new
transaction out of which comes a base
coin and the miners will allow you to do
that so put a zero coin in take a zero
coin out but a different zero coin and
all that anybody needs to know is that
you have the right to do that because
you put in some zero coin in the past it
doesn’t matter which zero coin and you
can’t do that twice you can’t do a spend
twice corresponding to a single mint
because the serial number now will
become public and there’s only one
serial number corresponding to one zero
coin and you only know the serial
numbers corresponding to your own zero coins
and not anyone else’s zero coins so
where does the anonymity property come
from here’s the anonymity property since
you’ve kept this random number R
secret and this is what is available on
the blockchain there are a number of
hashes or commitments corresponding to
the different zero coins that have been
put on the blockchain even though you’ve
revealed the serial number not knowing
this other random input R nobody can
try to brute-force this and guess which
of these zero coins corresponds to
your serial number so even after the
serial number inside an envelope has
been revealed and it’s been verified
that this serial number was inside one
of the envelopes we still don’t know
which envelope it was in so this is the
sort of magical property that
zero-knowledge proofs in cryptography
give us that you wouldn’t get in a real
world physical world envelope based
analogy of this so the next cool thing
about this whole construction is the
fact that these proofs are efficient and
I’m putting efficient in quotes here
and the sense in which they’re efficient
is that compared to what we know of zero
knowledge proofs and have come to expect of
them it’s quite an achievement that
these proofs are as efficient as they
are however compared to the efficiency
of Bitcoin transactions themselves these
are in fact quite slow so it occupies a
space in between those two so what exactly
do I mean by efficient the reason it’s
efficient is that it manages to avoid
being linear in the number of zero coins
on the chain even though that is what
you would expect why is that what you
would expect think about the statement
that the spender is proving here I know
a random number R such that either the
hash of the serial number with r
corresponds to the first commitment or
the first hash or the second commitment
or any one of these giant number of
commitments that reside on the
blockchain right so it’s a very long
statement that the prover is proving
it’s a statement whose length is
proportional to the number of zero coins
on the blockchain and yet the proof is
much smaller than that it’s not linear
it’s only logarithmic in n
the number of zero coins and that’s part of the
magic of Zerocoin that’s what makes it
possible to even run the system all
right moving on let’s talk about
Zerocash now Zerocash kind of takes the
cryptography sort of to the next level
it uses a cryptographic tool called
SNARKs which we won’t get into at all
but the upshot of that the upshot of the
use of these more efficient
cryptographic constructions for proofs
is that the efficiency gets to a point
where the authors suggest that you can
in fact run the whole system without
having any base coin all transactions
can be done in the zero-knowledge manner
you don’t need to have separate
expensive transactions that are used
only for mixing and a set of regular
everyday transactions that you use when
you don’t want special anonymity
properties that distinction is now gone
the claim is that you can run all of
these transactions sort of inside these
envelopes and what I mean by that is the
following all transactions are
zero coins and so Zerocash becomes
untraceable in a sense because there is
no base coin and the reason for that is
that splitting and merging of coins are
also transactions that are supported in
Zerocash itself without going to base
coin and in particular the transaction
values the transaction amounts you can
put those inside the commitments those
won’t be visible on the blockchain
anymore the only thing that the ledger
records publicly is the existence of
these transactions you know that Alice
put in some transaction you know much
later that Bob redeemed some transaction
who might be the same user might be a
different user but the only people who
need to know what the amount is are the
sender and receiver of any particular
transaction the miners don’t need to
know that if there is a transaction fee
then the miners need to know that fee
but that doesn’t really compromise your
anonymity property right so the ability
to run Zerocoin in this different
configuration where it’s not two
different coins anymore it’s not a base
coin with a mix layer on top but instead
an entirely untraceable system of
transactions puts Zerocash sort of in
the next level when it comes to
anonymity because a lot of the possible
side channel attacks that were true for
mixing that were true to a certain
extent at least for Zerocoin are no
longer true for Zerocash because the
transaction amounts will no longer be
visible in the public ledger but that
almost sounds too good to be true a
completely untraceable electronic cash
system it is ledger based but the ledger
doesn’t record anything that might
compromise anonymity or privacy well
there is one catch here’s the catch in
Zerocash it requires a certain setup
process to even set up the system
specifically one needs random and secret
inputs in order to generate the public
parameters think of those as public keys
except that these are giant public keys
they’re over a gigabyte in size and not
only that not only is the size a bit of
a problem these secret inputs for the
security of the system then have to be
securely destroyed so that nobody knows
what those secret inputs were that were
used in order to generate these public
parameters that seems like a bit of a
problem
and the reason that no one can know them
is because if somebody knows them it
doesn’t mean that they will be able to
compromise anonymity but they will be
able to create new zero coins for
themselves and nobody will be the wiser
which is also an equally bad problem for
the currency so it’s kind of an
interesting sociological problem here
how could some entity set up the system
and then convince everybody that they
have securely destroyed the parameters
that were of course necessary in order
to set up the system so it’s not
entirely clear how that can be solved
there have been various proposals for it
but at the moment we don’t have a very
clear idea of how to go forward on this
so what have we seen so far in all of
the different efforts to improve
anonymity in Bitcoin well if we put
them on a line as I’ll show you in a
second we see that there are five
clearly different levels of anonymity
that that we’ve seen in different
proposed solutions and what are these so
let’s look at not only the levels of
anonymity that these systems provide but
also the deployability of these systems
let’s start with Bitcoin which is
already here it’s only pseudonymous it
doesn’t even aspire to be really
anonymous and we’ve seen that pretty
damaging transaction graph analyses are
possible
I showed you many beautiful graphs with
the clustering of different addresses
and in many cases how to go from those
addresses to identities so not a lot of
anonymity provided by Bitcoin the next
level is simply using a single mix sort
of in a manual way which people are
doing right now with some of these
dedicated mix services and that still
allows transaction graph analysis
because as you might remember from the
four principles that I gave you if you
don’t have this automated system that
has uniform chunk sizes and so on a lot
of transaction graph analysis is still
possible and in addition you have to
worry that this mix might not be
trustworthy it might be storing records
and might be sharing them with other
people and again it could get hacked etc
the third level that we saw is a chain
of mixes
and this can be in a centralized model
or a decentralized model it doesn’t
matter both models give you roughly the
same level of anonymity but where the
anonymity improvement really comes in
for this one compared to a single mix
is that you have these standardized
chunk sizes and you have a series of
mixes and you have a variety of other
bells and whistles on top of it like
automated clients and so on and for this
some side channels are still possible
not as bad as before transaction graph
analysis is no longer that easy and you
still have to worry about an adversary
who might collude with multiple mixes or
in the decentralized model has some
peers that might be malicious and
compromise your anonymity this is of
course perfectly backward compatible
with Bitcoin could be deployed and
adopted any day hasn’t quite happened
yet in a way that we would consider to
be truly anonymous and then we saw
Zerocoin which is cryptographic mixing
baked
into the protocol doesn’t depend on
anybody
promising to destroy their records or
anything like that you just need to
trust the math so that’s a whole
different level of anonymity in my
opinion it still has some possible side
channels but it’s not as bad as the
other mixing based solutions that we saw
where it’s not baked into the protocol
and Zerocoin of course as we saw is an
altcoin so it’s not quite Bitcoin
compatible in the way that one might
hope and finally Zerocash the
difference between Zerocash and
Zerocoin is not so much at a fundamental
mathematical level but because of the
fact that you can run Zerocash in a
configuration where you get rid of the
basecoin altogether and the efficiency
is not too bad in that configuration
and so what that gives you is
untraceability which is something
on top of unlinkability so that’s a new
anonymity property and there really
aren’t any anonymity attacks that I can
think of at least but the downside of
course is that not only is it an altcoin
but it also has this very tricky setup
process that we don’t necessarily know
how to make progress on
so we’ve talked a lot about bitcoins
anonymity in this lecture but bitcoins
anonymity becomes even more powerful
when combined with other technologies in
particular anonymous communication
technologies we’ve talked about Tor a
little bit we’ve alluded to it several
times but now let’s go into more detail
let’s first set up the problem of
anonymous communication though so this
is what the system looks like there are
a bunch of senders there are a bunch of
recipients and messages are routed from
senders to recipients through this
network over here and of course there’s
going to be an attacker this attacker
and this is called the threat model the
attacker controls several things some of
these nodes in red are compromised by
the attacker some of these edges some of
these links from honest nodes to the
network are also controlled by the
attacker even if the nodes themselves
are not similarly some of the recipient
nodes over here and some of these links
from the network to the recipient node
are also controlled by the attacker and
finally some of the internal nodes of
the anonymous communication network are
under the control of the attacker but
crucially not all of the communication
network is controlled by the attacker
and we want to achieve anonymity in this
hostile environment and as before
anonymity refers to unlinkability
between the sender and the receiver so
how does Tor accomplish this it’s the
same old pattern of picking a chain of
intermediaries to route your messages
through and here it is in a nice visual
form and I have to thank the Electronic
Frontier Foundation for this slide so
what’s going on Alice over here wants to
talk to Bob over here so she preselects
a path through this set of routers and
the number of routers on the path is
fixed in the Tor protocol it’s always 3
but conceptually you can imagine that it
would be any number you want and the
more nodes you route through
the more anonymity you get or the harder
it is I should say to breach anonymity
so these nodes denoted with a plus are
all the Tor nodes and she picks some
subset of 3 nodes randomly in order to
route her message and the security
property that we get is that as long as
at least one of these three nodes that
she picks is not compromised or
colluding with the attacker then she is
assured to be safe here in that Alice
cannot be linked
to Bob by somebody who’s observing some
of the nodes in the network I should say
that there are many attacks possible on
Tor one of them for example is called
an end-to-end traffic correlation attack
so there are going to be timing patterns
in the flow of traffic between Alice and
whatever Bob is maybe a website and so
if the attacker controls both of these
links then just by observing the
correlation in those timing patterns he
might be able to determine that these
two nodes are in communication with each
other even if he knows nothing about the
route that the message took between them
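To make the timing correlation idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the bursty traffic model, the delay and jitter values, and the function names are hypothetical and are not taken from Tor or from any real attack tool. It only shows that if you count packets per time window on the entry side and on the exit side, the flow that really belongs to Alice stands out by simple correlation, with no knowledge of the route in between.

```python
import random

def bin_counts(timestamps, window, duration):
    """Count packets per time window: the timing pattern an observer sees."""
    counts = [0] * int(duration / window)
    for t in timestamps:
        idx = int(t / window)
        if 0 <= idx < len(counts):
            counts[idx] += 1
    return counts

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def bursty_flow(rng, duration):
    """Hypothetical traffic model: short bursts of packets at random times."""
    times, t = [], 0.0
    while t < duration:
        t += rng.expovariate(0.2)               # mean gap of 5 s between bursts
        for _ in range(rng.randint(5, 30)):     # packets in this burst
            times.append(t + rng.uniform(0, 0.5))
    return [x for x in times if x < duration]

def through_network(rng, timestamps):
    """The same packets seen on the exit side, with assumed delay and jitter."""
    return [t + 0.2 + rng.uniform(0, 0.1) for t in timestamps]

rng = random.Random(1)
DURATION, WINDOW = 300.0, 2.0

alice_entry = bursty_flow(rng, DURATION)                    # link Alice -> network
exit_flows = [bursty_flow(rng, DURATION) for _ in range(4)] # unrelated traffic
exit_flows.insert(2, through_network(rng, alice_entry))     # Alice's flow, index 2

entry_pattern = bin_counts(alice_entry, WINDOW, DURATION)
scores = [pearson(entry_pattern, bin_counts(f, WINDOW, DURATION))
          for f in exit_flows]
best = max(range(len(scores)), key=scores.__getitem__)
print("most correlated exit flow:", best)
```

The attacker here never decrypts anything and never touches a router. Watching both end links is enough, which is why this attack falls outside the "one honest node" guarantee.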
so one key point here is how do you hide
routing information what do I mean by
that when a message goes from Alice
to the first router it has to have the
IP address of Bob’s computer somewhere
in that message otherwise there is no
way that this router can appropriately
forward that on to reach the right
destination however we don’t want this
router to actually learn that IP address
because if the router does learn that IP
address then it knows both Alice’s IP
because the message came from her and
Bob’s IP because that’s where the
message is eventually going and now this
router has the link between the two ends
of the communication and this would be a
problem if this router were malicious so
as you might guess the answer involves
encryption and as you can see in this
picture these links are in green they’re
encrypted connections and this one is an
unencrypted connection let’s look in
more detail to see how this encryption
works it’s a specific way in which
encryption is used it’s called layered
encryption it resembles an onion so
that’s why onion routing is a related
concept here so what is going on here
Alice and router 1 share a symmetric key
that’s represented in purple Alice and
router 2 share this key that’s
represented in blue and Alice and router
3 share the key that’s represented
in gold now these symmetric keys are not
stored long term by any of these nodes
they’re established as necessary using
key exchange the only persistent keys
are the long term public keys of these
routers and these routers do in fact
have long-lived identities and public
keys and so on Alice of course does not
need to
have any long-term public key when she
picks a path of these routers she finds
their public keys executes key exchange
protocols and obtains these shared
symmetric keys and what she’s gonna do
is when she sends the message to r1
it’s going to be triply encrypted the
outermost layer of encryption is a
symmetric encryption between Alice and
r1 and so what this allows r1 to do is
peel off that layer of encryption like
peeling off an onion and when router 1
peels off that layer of encryption
inside it’s going to find the IP address
of router two and an encrypted message
to send to router 2 and it’s going to
forward that router 2 peels off a
further layer of encryption and then
router 3 a further layer of encryption
now the message is unencrypted
consisting of the plaintext message as
well as Bob’s IP address and so router 3
now sends that message in plaintext to
Bob of course
what you probably want to do is further
layer a protocol like HTTPS for secure
web browsing on top of Tor so that even
this message from router 3 to Bob
is encrypted but the Tor protocol itself
doesn’t guarantee that has no way of
guaranteeing that because Bob might be a
regular web server that doesn’t even
speak the Tor protocol and so there’s no
way that Tor can be responsible for the
encryption between r3 which is
called the exit node and the ultimate
recipient of the message I’ll leave you
to think about why this wouldn’t quite
work if there were only one layer of
encryption for example if Alice tried to
encrypt the message all the way from her
to r3 it wouldn’t quite work the routing
would not quite work out but as it is
the very neat property that you have is
that r1 only knows Alice’s IP address
and r2’s address it does not know r3’s
or Bob’s address and similarly every
node knows only the addresses of the
node that was one hop before it and one
hop after it and in fact when the
message gets to this point the IP
address of alice is not even present
anymore whether or not in encrypted form
so that’s really how you get anonymity
here if any one of these if r2 for
example were compromised then it would
learn r1’s and r3’s IP addresses but not
Alice’s or Bob’s so that’s how Tor works
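The layered encryption just described can be sketched in a few lines of Python. This is a toy under loud assumptions: the "cipher" is an XOR with a SHA-256 keystream, which is NOT secure and only stands in for a real symmetric cipher, the session keys are random bytes standing in for the result of the key exchange, and names like `wrap`, `peel`, "r1" and "bob" are hypothetical. What it does show is the onion structure: Alice wraps three layers, and each router peels exactly one and learns only the next hop.

```python
import hashlib
import json
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream.
    The same call encrypts and decrypts. Illustration only, not secure."""
    out = bytearray()
    for start in range(0, len(data), 32):
        pad = hashlib.sha256(key + start.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)

def wrap(key: bytes, next_hop: str, payload: bytes) -> bytes:
    """Add one onion layer: encrypt (next hop, inner payload) under key."""
    layer = json.dumps({"next": next_hop, "payload": payload.hex()}).encode()
    return keystream_xor(key, layer)

def peel(key: bytes, blob: bytes):
    """Remove one onion layer: returns the next hop and the inner payload."""
    layer = json.loads(keystream_xor(key, blob))
    return layer["next"], bytes.fromhex(layer["payload"])

# Assumed session keys, one per router, standing in for the key exchange.
keys = {name: secrets.token_bytes(32) for name in ("r1", "r2", "r3")}

# Alice builds the onion from the inside out: r3's layer first, r1's last.
message = b"hello bob"
onion = wrap(keys["r3"], "bob", message)
onion = wrap(keys["r2"], "r3", onion)
onion = wrap(keys["r1"], "r2", onion)

# Each router peels its own layer and forwards what is inside.
hop, blob, seen = "r1", onion, []
while hop in keys:
    next_hop, blob = peel(keys[hop], blob)
    seen.append((hop, next_hop))   # all this router learns about the route
    hop = next_hop

print(seen)         # which next hop each router saw
print(hop, blob)    # final recipient and the plaintext handed over by r3
```

Notice that only r3's layer contains Bob's address, and by then nothing in the message says it came from Alice. Also notice that what r3 hands to Bob is plaintext, exactly as in the lecture.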
let’s talk about Silk Road and in
particular the problem that a site like
Silk Road has to overcome which is that
Silk Road is what is known as a hidden
service in other words the Silk Road
server wants to hide its address for
obvious reasons if you haven’t heard
about Silk Road let me just say a
sentence about it briefly you’re gonna
see it in more detail in the next
lecture Silk Road was a website that
operated for a couple of years it was an
anonymous marketplace it sold a variety
of goods but the thing that it was most
known for is selling drugs and because
of the pervasive anonymity or at least
pseudonymity in the system the idea was
that it was very hard for law
enforcement to go after it and the story
of what happened next I will leave to
the next lecture but let’s look at the
technology that made something like Silk
Road possible and the implications of
that so here is a simplified algorithm
by which a server can keep its identity
hidden and yet provide services through
Tor what it does is it connects to
what is called a rendezvous point which
is one of the Tor routers through Tor
and then what it’s going to do is
it’s going to publish the mapping
between its name its domain name and the
address of the rendezvous point through
directory services that the Tor
system offers and these domain names are
not your regular DNS domain names that
wouldn’t work because it’s this whole
parallel system of routing and so these
are called onion addresses and they’re
going to look like this long string dot
onion and notice that it looks a lot
like Bitcoin public keys and it’s for
sort of the same reasons it’s because
anyone
can generate one of these and now the
client will have to learn the onion
address of the site that it wants to
visit when Silk Road existed if you
wanted to go to Silk Road you couldn’t
type in silkroad.com that wouldn’t make
any sense because Silk Road is not even
available over the regular web instead
you would have to through some manner
and this was a widely known address you
would have to find Silk Road’s address
by the way this is the onion address of
DuckDuckGo a search engine that offers
privacy and anonymity
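Here is a small sketch of why anyone can generate an onion address, much like anyone can generate a Bitcoin address. It follows the version 3 onion address encoding as I understand it from the Tor rendezvous specification: base32 of the public key, a two-byte SHA3-256 checksum, and a version byte. That format is newer than the shorter addresses in use when Silk Road operated. As a loud assumption, 32 random bytes stand in for a real Ed25519 public key, so the result is well formed but does not correspond to any reachable service.

```python
import base64
import hashlib
import secrets

def onion_address_v3(pubkey: bytes) -> str:
    """Encode a 32-byte public key as a version 3 style .onion address."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte public key")
    version = b"\x03"
    # Checksum: first two bytes of SHA3-256 over a fixed label, key, version.
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
    # 35 bytes encode to exactly 56 base32 characters, with no padding.
    label = base64.b32encode(pubkey + checksum + version).decode().lower()
    return label + ".onion"

# Assumption: random bytes stand in for a real Ed25519 public key.
fake_pubkey = secrets.token_bytes(32)
address = onion_address_v3(fake_pubkey)
print(address)
```

The name is derived from the key itself, so no registrar has to hand it out, and that is the sense in which it resembles a Bitcoin address.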
but you would find a similar address
that belonged to Silk Road and put that
into your Tor-enabled browser and then
what your client would automatically do
is look up the mapping for the address
of the rendezvous point connect to that
rendezvous point and through that
rendezvous point have an anonymous and
encrypted connection to the ultimate
server without the server having to
publish its actual IP address so that
covers some of the technology behind
Silk Road in particular anonymous
communication and how do you do
anonymous payments which is of course
with Bitcoin but still you need more
technology in order to make this whole
system work you need security in other
words how can you be sure that when you
pay someone on Silk Road they’re going
to actually sell you the goods Silk Road
had a reputation system for that and how
do you do anonymous shipping the site
pretty much left this to the
participants and advised buyers to
provide an anonymous PO box for example
to ship goods to so
let’s take a step back we’ve covered a
lot of Technology in this lecture
hopefully you’ve understood that Bitcoin
anonymity is a very powerful thing and
it gains in power when combined with
other technologies in particular
anonymous communication technologies and
also anonymity is a deeply morally
ambiguous thing there are many moral
distinctions that we would like to make
that we’re not able to adequately
express at the technological level and
so some of this moral ambiguity appears
to be inherent hopefully it’s also been
clear that anonymity is very fragile one
mistake can create a link that
you’re trying to hide but also anonymity
is an important thing to protect it’s
worthwhile protecting it has a lot of
good uses in addition to bad uses so
most of the things that we’ve talked
about today are either at the forefront
of research technologically or they’re a
topic of serious ethical debate none of
this is really settled and so this is an
ongoing conversation and area of ongoing
research we don’t know which anonymity
system for Bitcoin if any is going to
become prominent or mainstream and so
this is a great opportunity for you
either as a developer or in thinking
through the ethical implications to get
involved in some of these issues and
hopefully what you’ve learned in this
lecture gives you the right background
for that
