Scalability has been a perennial stumbling block for blockchain, the much-hyped distributed ledger technology. That’s because as a blockchain grows, so, too, does the amount of data stored on each computer in the peer-to-peer network.
It’s a critical issue, because blockchain has outgrown its cryptocurrency roots and is now poised to re-shape supply chains, parts of the FinTech world, real estate and a host of other industries.
To achieve its full potential, blockchain has to be able to grow exponentially without becoming too slow or bogging down the computers on which it runs. That means making it scalable.
And that’s where "sharding" comes in.
Sharding is one of several methods being tested by start-ups, developers and current blockchain platforms such as Ethereum to see if it can help blockchain developers finally climb the scalability mountain. To understand sharding, and how it might help, you first have to understand a little more about blockchains.
One of the main problems with public blockchains involves something called “consensus protocols” based on “proof of work.” This is what underlies how transactions are authenticated; a majority of the blockchain users have to agree that proposed transactions are authentic and can be added to the chain.
In other words, there has to be a consensus. But the consensus algorithms used by the likes of bitcoin and Ethereum are highly compute-intensive. They use a LOT of CPU cycles.
Further complicating things: Proof of Work-based blockchains require each authenticating computer, or node, to record all the data on the chain because it’s part of the consensus process. But as more transactions occur, and the blockchain grows, more computing cycles are needed. And everything. Slows. Down.
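To get a feel for why proof of work eats CPU cycles, here is a toy sketch in Python. It is not how bitcoin or Ethereum actually mine: real networks hash a structured block header against a numeric target, while this example simply looks for a SHA-256 digest that starts with a few zeros. The function name `mine`, the sample transaction text and the difficulty setting are all made up for illustration.

```python
import hashlib
from typing import Tuple


def mine(block_data: str, difficulty: int) -> Tuple[int, str]:
    """Try nonces one by one until the SHA-256 hex digest of
    block_data + nonce starts with `difficulty` zero characters."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1  # no shortcut exists: the only option is to keep guessing


# Hypothetical transaction text; each extra zero of difficulty
# multiplies the expected number of guesses by 16.
nonce, digest = mine("alice pays bob 5", difficulty=4)
print(f"found nonce {nonce} -> {digest}")
```

Checking the answer takes a single hash; finding it takes thousands of guesses even at this toy difficulty. That asymmetry is the point of proof of work, and it is also why it is expensive.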
How slow? Well, bitcoin can only process 3.3 to 7 transactions per second – and a single transaction can take 10 minutes to complete. Ethereum is a little faster; it can process from 12 to 30 transactions per second. But that’s nothing compared to Visa’s 50-year-old electronic payment network, VisaNet. It processes around 1,700 transactions a second.
In order to compete with VisaNet and other conventional networks in terms of scalability and performance, blockchain needs turbocharging.
That, again, is where sharding comes in.
The idea has been around for a while. Originally a technique for horizontally partitioning databases, sharding spreads the computing and storage workload across a blockchain network so that each node no longer has to process the entire network's transactional load. Each node maintains only the information related to its specific partition, or shard.
Sharding allows a blockchain to remain decentralized and secure, two of the things that make it so popular. The information in a shard can still be shared and everyone can see all the ledger entries. But no node has to record and store all of the data held by every other node. That allows data to be stored more quickly and makes it easier to find because its location is mapped on the blockchain. Because fewer nodes now “see” and process each transaction, more transactions can be processed in parallel.
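The partitioning idea can be sketched in a few lines of Python. This is a simplified illustration, not any platform's actual scheme: it assumes transactions are routed to a shard by hashing the sender's account name, and the names `NUM_SHARDS`, `shard_for` and `partition`, along with the sample accounts, are invented for the example.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example


def shard_for(account: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account name to a shard number in a repeatable way."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards: int = NUM_SHARDS):
    """Group transactions by the shard that owns the sender's account."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx["sender"], num_shards)].append(tx)
    return shards


# Hypothetical workload: 20 transfers from 20 different accounts.
transactions = [{"sender": f"account-{i}", "amount": i} for i in range(20)]
for shard_id, txs in sorted(partition(transactions).items()):
    print(f"shard {shard_id} handles {len(txs)} transactions")
```

Each shard ends up with only a slice of the workload, and because the mapping is deterministic, anyone can work out which shard holds a given account's records.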
While sharding could be the key to allowing blockchains to scale securely, hurdles remain.
For one, if you’re going to maintain blockchain security, you have to guard against what are called shard takeovers. (Corrupting the nodes in a given shard can lead to the permanent loss of that shard's data. That would be very bad.)
Ethereum tackles the security issue by randomly assigning nodes to shards – and then randomly reassigning those nodes to other shards.
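Here is a rough Python sketch of random assignment and reassignment. It is an illustration of the general idea only, not Ethereum's actual mechanism: the function `assign_nodes`, the node names and the fixed seed are assumptions, and a real network would presumably need a randomness source that participants cannot predict or bias, rather than a seeded generator.

```python
import random


def assign_nodes(node_ids, num_shards, rng):
    """Shuffle the nodes, then deal them out to shards round-robin,
    so every shard gets a random and roughly equal membership."""
    shuffled = list(node_ids)
    rng.shuffle(shuffled)
    shards = {shard: [] for shard in range(num_shards)}
    for position, node in enumerate(shuffled):
        shards[position % num_shards].append(node)
    return shards


rng = random.Random(42)  # fixed seed only so the example is repeatable
nodes = [f"node-{i}" for i in range(12)]  # hypothetical node names

epoch_1 = assign_nodes(nodes, num_shards=3, rng=rng)
epoch_2 = assign_nodes(nodes, num_shards=3, rng=rng)  # the reshuffle

for shard in range(3):
    print(shard, epoch_1[shard], "->", epoch_2[shard])
```

The reasoning behind the reshuffle: an attacker who cannot choose, or predict, which shard their nodes land in has a much harder time concentrating enough of them in one shard to take it over.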
A second challenge involves "thin" clients, also called SPV (Simplified Payment Verification) wallets, and ensuring that nodes have the full picture of the blockchain’s current state while it's divided among shards. To address that issue, thin clients communicate via separate networks and maintain local state copies for each shard.
And finally, it’s important to note that inter-shard communication, while good for security, still poses a challenge – because each shard appears as a separate blockchain network.
So, you can see that while sharding has the potential to eliminate a lot of the scaling problems blockchain has, it’s still very much in the development-and-testing phase. Pretty much like blockchain itself.