[David]: I should start by thanking Cliff
Lynch for inviting me back even though I’m
retired, and for letting me debug this talk
at Berkeley’s Information Access Seminar.
I plan to talk for 20 minutes and leave plenty
of time for questions.
A lot of information will be coming at you
fast.
Afterwards, I encourage you to consult the
whole text of my talk and much additional
material on my blog.
Follow the links to the sources to get the
details that you probably missed.
We’re in a period when blockchain or distributed
ledger technology is the solution to everything,
so it’s inevitable that it will be proposed
as a solution to the problems of academic
communication and digital preservation.
In the second of a three-part series, Ian
Mulvaney has a comprehensive review of the
suggested applications of blockchain for academic
communication in three broad classes.
Those are priority claims, access to resources,
and rights.
Mulvaney discusses each of them in some detail
and doesn’t find a strong case for any of
them.
In the third part, he looks at some of the implementation
efforts currently underway and divides their
motivations into two groups.
I quote, “The first comes from commercial
interests where management of rights, IP and
ownership is complex, hard to do, and has
led to unusable systems that are driving researchers
to sites like Sci-Hub, scaring the bejesus
out of publishers in the process.
The other trend is for a desire to move to
a decentralized web and a decentralized system
of validation and reward, in a way trying
to move even further away from the control
of publishers.
It is absolutely fascinating to me that two
diametrically opposite philosophical sides
are converging on the same technology as the
answer to their problems.
Could this technology perhaps be just holding
up an unproven and untrustworthy mirror to
our desires, rather than providing any real
viable solutions?”
This talk answers Mulvaney’s question in the
affirmative.
I’ve been writing skeptically about cryptocurrencies
and blockchain technology for more than five
years.
What are my qualifications for such a long
history of pontification?
More than fifteen years ago, nearly five years
before Satoshi Nakamoto published the Bitcoin
protocol, a cryptocurrency based on a decentralized
consensus mechanism using proof-of-work, my
co-authors and I won a “best paper” award
at the prestigious SOSP workshop for a decentralized
consensus mechanism using proof-of-work.
It’s the protocol underlying the LOCKSS system.
The originality of our work didn’t lie in
decentralization, distributed consensus, or
proof-of-work.
All of these were part of the nearly three
decades of research and implementation leading
up to the Bitcoin protocol, as described by
Arvind Narayanan and Jeremy Clark in a paper
called Bitcoin’s Academic Pedigree.
Our work was original only in its application
of these techniques to statistical fault tolerance;
Nakamoto’s only in its application of them
to preventing double-spending in cryptocurrencies.
We’re going to walk through the design of
a system to perform some function, say monetary
transactions, storing files, recording reviewers’
contributions to academic communication, verifying
archival content, whatever.
Being of a naturally suspicious turn of mind,
you don’t want to trust any single central
entity, but instead want a decentralized system.
You place your trust in the consensus of a
large number of entities, which will in effect
vote on the state transitions of your system
– the transactions, reviews, archival content,
whatever.
You hope that the good entities will out-vote
the bad entities.
In the jargon, the system is trustless, which
is a misnomer.
Techniques using multiple voters to maintain
the state of a system in the presence of unreliable
and malign voters were first published in
The Byzantine Generals Problem by Lamport
and coauthors in 1982.
Alas, Byzantine Fault Tolerance (BFT) requires
a central authority to authorize entities
to take part.
In the blockchain jargon, it is permissioned.
You would rather let anyone interested take
part, a permissionless system with no central
control.
The security of your permissionless system
depends upon the assumption of uncoordinated
choice, the idea that each voter acts independently
upon its own view of the system’s state.
If anyone can take part, your system is vulnerable
to Sybil attacks, in which an attacker creates
many apparently independent voters who are
actually under his sole control.
If creating and maintaining a voter is free,
anyone can win any vote they choose simply
by creating enough Sybil voters.
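To see why free identities are fatal, here is a minimal sketch of a
plurality vote. The voter counts and the labels "valid" and "forged" are
made up for illustration; nothing here models a real protocol.

```python
from collections import Counter

def tally(votes):
    """Return the choice with the most votes (simple plurality)."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical numbers: 100 independent honest voters.
honest_votes = ["valid"] * 100
# One attacker who can create identities for free makes 101 of them.
sybil_votes = ["forged"] * 101

print(tally(honest_votes))                # honest voters alone
print(tally(honest_votes + sybil_votes))  # the attacker out-votes them
```

However many honest voters there are, the attacker only needs one more
identity than that, which costs nothing in this model.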
So creating and maintaining a voter has to
be expensive.
Permissionless systems can defend against
Sybil attacks by requiring a vote to be accompanied
by a proof of the expenditure of some resource.
This is where proof-of-work comes in, a concept
originated by Cynthia Dwork and Moni Naor
in 1992.
To vote in a proof-of-work blockchain such
as Bitcoin’s or Ethereum’s requires computing
very many otherwise useless hashes.
The idea is that the good voters will spend
more, compute more useless hashes, than the
bad voters.
Brunnermeier and Abadi’s Blockchain Trilemma
shows that a blockchain can have at most
two of the following three attributes:
correctness, decentralization, and cost-efficiency.
Obviously, your system needs the first two,
so the third has to go.
Running a voter – mining, in the jargon – in
your system has to be expensive if the system
is to be secure.
No one will do it unless they are rewarded.
They can’t be rewarded in fiat currency, because
that would need some central mechanism for
paying them.
So the reward has to come in the form of coins
generated by the system itself, a cryptocurrency.
To scale, permissionless systems need to be
based on a cryptocurrency.
The system’s state transitions will need to
include cryptocurrency transactions in addition
to records of files, reviews, archival content,
whatever.
Your system needs names for the parties to
these transactions.
There is no central authority handing out
names, so the parties need to name themselves.
As proposed by David Chaum in 1981, they can
do so by generating a public-private key pair,
and using the public key as the name for the
source or sink of each transaction.
In practice, this is implemented in wallet
software, which stores one or more key pairs
for use in transactions.
The public half of the pair is a pseudonym.
Unmasking the person behind the pseudonym
turns out to be fairly easy in practice.
The security of the system depends upon the
user and the software keeping the private
key secret.
This can be difficult, as Nicholas Weaver’s
computer security group at Berkeley discovered
when their wallet was compromised and their
Bitcoins were stolen.
The capital and operational costs of running
a miner include buying hardware, power, network
bandwidth, staff time, etc.
Bitcoin’s volatile price, high transaction
fees, low transaction throughput, and large
proportion of failed transactions mean that
almost no legal merchants accept payment in
Bitcoin or any other cryptocurrency.
Thus one essential part of your system is
one or more exchanges, at which the miners
can sell their cryptocurrency rewards for
the fiat currency they need to pay their bills.
Who is on the other side of those trades?
The answer has to be speculators, betting
that the price of the cryptocurrency will
increase.
Thus a second essential part of your system
is a general belief in the inevitable rise
in price of the coins by which the miners
are rewarded.
If miners believe that the price will go down,
they will sell their rewards immediately,
a self-fulfilling prophecy.
Permissionless blockchains require an inflow
of speculative funds at an average rate greater
than the current rate of mining rewards if
the price is not to collapse.
To maintain Bitcoin’s price at $4,000 – it’s
currently about $3,500 – requires an inflow
of $300,000 an hour.
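The arithmetic behind that figure can be sketched as follows. The block
subsidy of 12.5 BTC is an assumption, being the subsidy in force when
this talk was given. The sketch ignores transaction fees and assumes
miners sell everything they earn.

```python
BLOCK_REWARD_BTC = 12.5   # assumption: block subsidy at the time of the talk
BLOCKS_PER_HOUR = 6       # one block every ten minutes

def required_inflow_per_hour(price_usd: float) -> float:
    """Dollars per hour buyers must supply to absorb newly mined coins."""
    return BLOCK_REWARD_BTC * BLOCKS_PER_HOUR * price_usd

print(required_inflow_per_hour(4_000))  # 300000.0
```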
In order to spend enough to be secure, say
$300,000 an hour, you need a lot of miners.
It turns out that a third essential part of
your system is a small number of mining pools.
Bitcoin has the equivalent of around 3 million
Antminer S9 miners, and a block time of ten
minutes.
Each S9, costing maybe $1,000, can expect
a reward about once every 60 years.
It will be obsolete in about a year, so only
one in 60 will ever earn anything.
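Those numbers can be checked with a short sketch. It assumes every miner
has an equal share of the hash power and that block discovery is
memoryless. The function names are hypothetical.

```python
import math

MINERS = 3_000_000                 # network size in S9-equivalents, from the talk
BLOCK_MINUTES = 10                 # Bitcoin's block time
MINUTES_PER_YEAR = 60 * 24 * 365

def years_between_rewards(miners=MINERS, block_minutes=BLOCK_MINUTES):
    """Expected wait for one miner holding a 1/miners share of the hash power."""
    return miners * block_minutes / MINUTES_PER_YEAR

def chance_of_any_reward(lifetime_years, miners=MINERS):
    """Probability of at least one reward before the hardware is obsolete."""
    return 1 - math.exp(-lifetime_years / years_between_rewards(miners))

print(round(years_between_rewards(), 1))   # a little under 60 years
print(round(1 / chance_of_any_reward(1)))  # roughly one miner in this many
```

Under these assumptions the expected wait comes out a little under 60
years, consistent with the round figure above.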
To smooth out their income, miners join pools,
contributing their mining power and receiving
the corresponding fraction of the rewards
earned by the pool.
These pools have strong economies of scale,
so successful cryptocurrencies end up with
a majority of their mining power in three
to four pools.
Each of these big pools can expect a reward
about every hour or so.
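A sketch of why pooling smooths income, under simplifying assumptions: a
proportional payout with no pool fee, and a hypothetical pool holding 15%
of the network's mining power. Real pools use a variety of payout
schemes.

```python
def minutes_between_rewards(pool_share: float, block_minutes: float = 10) -> float:
    """Expected time between blocks found by a pool with this share of the power."""
    return block_minutes / pool_share

def member_payout(block_reward: float, member_power: float, pool_power: float) -> float:
    """A member's cut of one block, proportional to the power contributed."""
    return block_reward * member_power / pool_power

# Hypothetical pool with 15% of the network: a block roughly every hour.
print(round(minutes_between_rewards(0.15)))

# One machine among 450,000 equal ones: a tiny but frequent payment.
print(member_payout(12.5, 1, 450_000))
```

The expected income per miner is unchanged; only its variance shrinks.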
These blockchains aren’t decentralized,
but centralized around a few large pools.
At multiple times in 2014, one mining pool
controlled more than 51% of the Bitcoin mining
power.
At almost all times since, three to four pools
have controlled the majority of the Bitcoin
mining power.
Currently two of them are controlled by Bitmain,
the dominant supplier of mining ASICs.
With the advent of mining-as-a-service, 51%
attacks have become endemic among the smaller
alt-coins.
The security of a blockchain depends upon
the assumption that these few pools are not
conspiring together outside the blockchain,
an assumption that is impossible to verify
in the real world, and, by Murphy’s Law, is
therefore false.
Similar off-chain collusion among cryptocurrency
traders allows for extremely profitable pump-and-dump
schemes.
In practice, the security of a blockchain
depends not merely on the security of the
protocol itself, but on the security of the
core software and the wallets and exchanges
used to store and trade its cryptocurrency.
This ancillary software has bugs, such as
the recently revealed major vulnerability
in Bitcoin Core, the Parity Wallet fiasco,
and the routine heists using vulnerabilities
in exchange software.
Recent game-theoretic analysis suggests that
there are strong economic limits to the security
of cryptocurrency-based blockchains.
For safety, the total value of transactions
in a block needs to be less than the value
of the block reward.
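That safety condition can be written as a one-line check. This is only a
sketch: the function name and the dollar figures are hypothetical, and it
leaves out details of the underlying analysis.

```python
def block_is_economically_safe(tx_values_usd, block_reward_usd):
    """True if the value moved in a block is below what miners earn for it."""
    return sum(tx_values_usd) < block_reward_usd

# Hypothetical block reward worth $50,000.
print(block_is_economically_safe([10_000, 5_000], 50_000))   # small payments
print(block_is_economically_safe([2_000_000], 50_000))       # one large payment
```

The second case is the worrying one: a single large transaction is worth
more than the reward for mining the block honestly.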
Your system needs an append-only data structure
to which records of the transactions, files,
reviews, archival content, whatever are appended.
It would be bad if the miners could vote to
re-write history, undoing these records.
In the jargon, the system needs to be immutable,
which is another misnomer.
The necessary data structure for this purpose
was published by Stuart Haber and W. Scott
Stornetta in 1991.
A company using their technique has been providing
a centralized service of securely time-stamping
documents, effectively a blockchain, for nearly
a quarter of a century.
It is a form of Merkle tree, or hash tree, published
by Ralph Merkle in 1980.
For blockchains, it’s a linear chain to which
fixed-size blocks are added at regular intervals.
Each block contains the hash of its predecessor,
so it’s a chain of blocks.
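A minimal sketch of such a chain, assuming blocks are plain dictionaries
serialized as JSON and hashed with SHA-256. The field names `prev_hash`
and `records` are made up for illustration and are not any real
blockchain's format.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's full contents, including its predecessor's hash."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain: list, records: list) -> None:
    """Append a block that commits to the block before it."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "records": records})

def first_broken_link(chain: list):
    """Return the index of the first block whose prev_hash is wrong, or None."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None

chain = []
for records in (["review A"], ["file B"], ["transaction C"]):
    append_block(chain, records)

print(first_broken_link(chain))           # None: the chain is consistent
chain[0]["records"] = ["forged review"]   # rewrite history in the first block
print(first_broken_link(chain))           # 1: the next block no longer matches
```

Note that this check only covers blocks that have a successor. Detecting
a change to the newest block needs a trusted copy of its hash from
somewhere else.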
The blockchain is mutable; it’s just rather
hard to mutate it without being detected,
because of the Merkle tree’s hashes, and
easy to recover, because there are Lots Of
Copies Keeping Stuff Safe.
But this is a double-edged sword.
Immutability makes systems incompatible with
the GDPR, and immutable systems to which anyone
can post information will be suppressed by
governments.
A user of your system wanting to perform a
transaction, store a file, record a review,
whatever, needs to persuade miners to include
their transaction in a block.
Miners are coin-operated; you need to pay
them to do so.
How much do you need to pay them?
That question reveals another economic problem,
fixed supply and variable demand, which equals
variable price.
Each block is, in effect, a blind auction
among the pending transactions.
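The auction can be sketched as a greedy selection by fee per unit of
block space. This is a simplified model, not any particular client's
algorithm, and the transaction fields and numbers are hypothetical.

```python
def fill_block(pending, capacity):
    """Pick transactions by highest fee per unit of size until the block is full."""
    chosen, used = [], 0
    by_fee_rate = sorted(pending, key=lambda tx: tx["fee"] / tx["size"], reverse=True)
    for tx in by_fee_rate:
        if used + tx["size"] <= capacity:
            chosen.append(tx)
            used += tx["size"]
    return chosen

pending = [
    {"id": "a", "fee": 1, "size": 1},
    {"id": "b", "fee": 5, "size": 1},
    {"id": "c", "fee": 3, "size": 1},
]

# With room for only two transactions, the lowest bidder waits.
print([tx["id"] for tx in fill_block(pending, capacity=2)])  # ['b', 'c']
```

Because capacity is fixed, more pending transactions cannot add supply.
They can only push up the fee needed to get in.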
So let’s talk about CryptoKitties, a game
that brought the Ethereum blockchain to its
knees, despite the bold claims that it could
handle unlimited decentralized applications.
How many users did it take to cripple the
network?
It was far fewer than non-blockchain apps
can handle with ease.
CryptoKitties peaked at about 14,000 users.
NeoPets, a similar centralized game, peaked
at about 2,500 times as many.
CryptoKitties’ average price per transaction
spiked 465% between November 28 and December
12 as the game got popular, a major reason
why it stopped being popular.
The same phenomenon happened during Bitcoin’s
price spike around the same time.
Cryptocurrency transactions are affordable
only if no one wants to transact.
When everyone does, they immediately become
unaffordable.
Nakamoto’s Bitcoin blockchain was designed
only to support recording transactions.
It can be abused for other purposes, such
as storing illegal content.
But it is likely that you need additional
functionality, which is where Ethereum’s smart
contracts come in.
These are fully functional programs, written
in a JavaScript-like language, that are embedded
in Ethereum’s blockchain.
They are mainly used to implement Ponzi schemes,
but they can also be used to implement Initial
Coin Offerings, games such as CryptoKitties,
and gambling parlors.
Further, in On-Chain Vote Buying and the Rise
of Dark DAOs, Philip Daian and co-authors
show that smart contracts also provide for
untraceable on-chain collusion in which the
parties are mutually pseudonymous.
Smart contracts are programs, and programs
have bugs.
Some of the bugs are exploitable vulnerabilities.
Research has shown that the rate at which
vulnerabilities in programs are discovered
increases with the age of the program.
The problems caused by making vulnerable software
immutable were revealed by the first major
smart contract.
The Decentralized Autonomous Organization,
the DAO, was released on the 30th of April
2016, but on the 27th of May 2016 Dino Mark,
Vlad Zamfir, and Emin Gün Sirer posted A
Call for a Temporary Moratorium on The DAO,
pointing out some of its vulnerabilities;
it was ignored.
Three weeks later, when The DAO contained
about 10% of all the Ether in circulation,
a combination of these vulnerabilities was
used to steal its contents.
The loot was restored by a hard fork, the
blockchain’s version of mutability.
Since then it has become the norm for smart
contract authors to make them “upgradeable,”
so that bugs can be fixed.
“Upgradeable” is another way of saying immutable
in name only.
So, this is the list of people your permissionless
system has to trust if it’s going to work
as advertised over the long term.
So where have we ended up?
You started out to build a trustless, decentralized
system but you have ended up with a trustless
system that trusts a lot of people you have
every reason not to trust; a decentralized
system that is centralized around a few large
mining pools that you have no way of knowing
aren’t conspiring together; an immutable
system that either has bugs you cannot fix,
or is not immutable; a system whose security
depends on it being expensive to run, and
which is thus dependent upon a continuing
inflow of funds from speculators; and a system
whose coins are convertible into large amounts
of fiat currency via irreversible pseudonymous
transactions, which is thus an irresistible
target for crime.
If the price keeps going up, the temptation
for your trust to be violated is considerable.
If the price starts going down, the temptation
to cheat to recover losses is even greater.
So maybe it’s time for a re-think.
Suppose you give up on the idea that anyone
can take part and accept that you have to
trust a central authority to decide who can
and who can’t vote.
You will have a permissioned system.
The first thing that happens is that it is
no longer possible to mount a Sybil attack,
so there is no reason running a node need
be expensive.
You can use BFT to establish consensus, as
IBM’s Hyperledger, the canonical permissioned
blockchain system, does.
You need many fewer nodes in the network,
and running a node just got way cheaper.
Overall, the aggregated cost of the system
got orders of magnitude cheaper.
Now there is a central authority.
It can collect fiat currency for network services
and use it to pay the nodes.
No need for cryptocurrency, exchanges, pools,
speculators, or wallets, so much less temptation
for bad behavior.
This is now the list of entities you trust.
Trusting a central authority to determine
the voter roll has eliminated the need to
trust a whole lot of other entities.
The permissioned system is more trustless,
and since there is no need for pools, the
network is more decentralized despite having
fewer nodes.
How many nodes does your permissioned blockchain
need?
The rule for BFT is that 3f + 1 nodes can
survive f simultaneous failures.
That’s an awful lot fewer than you need for
a permissionless proof-of-work system.
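The 3f + 1 rule is easy to put into numbers. The function names below are
hypothetical.

```python
def min_nodes(f: int) -> int:
    """Smallest network that tolerates f simultaneous Byzantine failures."""
    return 3 * f + 1

def max_faults(n: int) -> int:
    """Largest number of simultaneous failures an n-node network survives."""
    return (n - 1) // 3

print(min_nodes(1))    # 4
print(min_nodes(3))    # 10
print(max_faults(6))   # 1: six nodes tolerate no more than four do
print(max_faults(7))   # 2
```

Even tolerating three simultaneous failures takes only ten nodes.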
What you get from BFT is a system that, unless
it encounters more than f simultaneous failures,
remains available and operating normally.
The problem with BFT is that, if ever it encounters
more than f simultaneous failures, the state
of the system is irrecoverable.
If you want a system that can be relied upon
for the long term, you need a way to recover
from disaster.
Successful permissionless blockchains have
Lots Of Copies Keeping Stuff Safe, so recovering
from a disaster that doesn’t affect all of
them is manageable.
So in addition to implementing BFT, you need
to back up the state of the system each block
time, ideally to write-once media so that
the attacker can’t change it.
But if you’re going to have an immutable backup
of the system’s state, and you don’t need
continuous uptime, you can rely on the backup
to recover from failures.
In that case, you can get away with, say,
two replicas of the blockchain in conventional
databases, saving even more money.
I’ve shown that, whatever consensus mechanism
they use, permissionless blockchains are not
sustainable for very fundamental economic
reasons.
These include the need for speculative inflows
and mining pools, security linear in cost,
economies of scale, and fixed supply versus
variable demand.
Proof-of-work blockchains are also environmentally
unsustainable.
The top five cryptocurrencies are estimated
to use as much energy as The Netherlands.
This isn’t to take away from Nakamoto’s ingenuity.
Proof-of-work is the only consensus system
shown to work well for permissionless blockchains
at scale.
The consensus mechanism works, but energy
consumption and emergent behaviors at higher
levels of the system make it unsustainable.
So now there’s time for questions.