The Ledger behind Bitcoin
Bitcoin means a lot of things to a lot of people. Investors cannot get enough of it, large companies are increasingly adopting its underlying technology, and countless startups, even unicorns, have been built on that same tech. It is therefore no exaggeration to say that every participant has a view on this phenomenon.
At Infinite Lambda, we view Bitcoin as a distributed ledger. And as with all other ledgers, a well engineered solution could lower the cost and reduce the friction of a transaction.
Let us first start with a centralised ledger. Here, the goal is to keep track of the full transaction history and thereby approve or reject transactions as quickly as possible. Further considerations are horizontal scalability and easy recovery from errors.
Simply persisting the end result (account balances) in a database is difficult to scale out and hard to recover from errors. Instead, an event sourcing pattern is introduced: all transactions are persisted in strict order as the source of truth, while projections (reductions of the transactions into a snapshot of state up to a particular point in time) are created only when required. Multiple projections can be created, which achieves horizontal scalability. On top of this, because the transactions themselves are persisted, they can be replayed whenever an error occurs. For more details, see our real-world case study on building a core banking system for a fintech.
This pattern has been used extensively by major banks in their core banking systems for more than a decade. Bitcoin's chain of transaction blocks, its cryptographic innovations aside, is essentially an implementation of the event sourcing pattern.
Nevertheless, even in this highly mature space, innovations are being adopted to enhance the performance and availability of the event sourcing ledger.
First, on performance: throughput can reach a million transactions per second by using an efficient data structure such as the LMAX Disruptor, which preallocates its buffers and thereby minimises garbage collection. For end-to-end latency, UDP-based network communication layers like Aeron have greatly improved network performance. Even at the message protocol level, binary, positional codecs such as Simple Binary Encoding add significantly to end-to-end speed. The falling cost of memory also allows us to keep projections in off-heap memory instead of on disk.
Second, on availability: we are seeing wider adoption of the event sourcing ledger among new fintechs. Unlike traditional users such as exchanges and brokers, these fintech companies may not require such extreme performance, but they do want a cloud native architecture. In the past, delivering a high-performance, event-based ledger in the cloud would have been a challenge. With managed Kafka and Redis solutions from major cloud providers like AWS, and with the growing capabilities of Kubernetes, we have delivered such solutions with the majority of components running as managed services or in K8s clusters. This makes the high-performance ledger ever more available in the cloud for all fintechs.
Event Sourcing in a Centralised Ledger
Cryptocurrencies such as Bitcoin have gained great popularity lately, and there has been no shortage of articles explaining how cryptocurrency and its underlying technology, blockchain, work. This article takes a different perspective: that of Infinite Lambda, which has been building high-performance ledgers for clients in fintech and banking.
Transactions are the building blocks of any financial system, and cryptocurrency is no exception. Transactions are mutations of the system state, usually between two accounts, with events such as:
- Spending a currency
- Receiving a currency
- Paying fees
All of the transactions since the inception of the ledger form an event stream. By replaying these transactions one by one in strict order, one can always arrive at the same ledger state (i.e. the account balances across all participants). This is the foundation of an event sourcing ledger.
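As a minimal sketch of this idea (the `Tx` type and its field names are illustrative, not taken from any particular system), replaying an ordered stream of transaction events always reduces to the same balances:

```go
package main

import "fmt"

// Tx is a minimal transaction event covering the three mutations above:
// spending, receiving and paying fees. Field names are illustrative.
type Tx struct {
	From, To string
	Amount   int64 // in the smallest currency unit
	Fee      int64
}

// Replay folds the ordered event stream into a balance projection.
// Replaying the same stream in the same order always yields the same state.
func Replay(events []Tx) map[string]int64 {
	balances := make(map[string]int64)
	for _, e := range events {
		balances[e.From] -= e.Amount + e.Fee // spend plus fee
		balances[e.To] += e.Amount           // receive
	}
	return balances
}

func main() {
	stream := []Tx{
		{From: "alice", To: "bob", Amount: 100, Fee: 1},
		{From: "bob", To: "carol", Amount: 40, Fee: 1},
	}
	fmt.Println(Replay(stream)) // the projection derived from the stream
}
```

A projection like this can be rebuilt at any time from the persisted stream, which is exactly what makes error recovery and horizontal scale-out possible.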
Why is Blockchain different from other ledgers?
First of all, blockchain is still a ledger that sums up all the transactions since its inception. The first block (each block aggregates multiple transactions) is called the genesis block. Each transaction represents a mutation of the ledger, mainly spending, receiving and fees. Apart from the implementation difference that spending is recorded as an immutable signing of the coin by its owner, rather than as an immutable line in a bank's records, every transaction can be conceptualised in the same way.
However, there is a major difference between the traditional ledgers we build for banks and blockchain. In a traditional ledger, the ledger itself is held in one centralised place: for example, all transactions are persisted in a Kafka log. Scalability is achieved via the pub/sub nature of the underlying technologies; you can create an unlimited number of materialised views but, logically, the source of truth is still centralised in one place.
In a blockchain scenario, there is no single centralised source of truth. Instead, Bitcoin relies on a consensus protocol: the source of truth is agreed by multiple nodes in the network, under the assumption that the majority of nodes are honest.
One way to visualise the difference is double spending. In a centralised system, the ledger operator, usually the bank, prevents a user from spending the same money twice. With blockchain, double spending is prevented by the network itself, which removes the central point of failure. This makes blockchain very powerful for decentralising control of the ledger and preventing any arbitrary changes that the network has not agreed to.
All of this, however, comes at a performance cost, as confirming a block requires extensive network round trips and CPU power. For more technical details, please refer to Addendum I, which illustrates this using the Bitcoin Core C++ implementation.
Which ledger shall I use in my business case?
We have outlined both ends of the spectrum between centralised and decentralised ledgers. At one end, we have the event sourcing ledger, which keeps a common log of all transactions. At the other, we have the permissionless blockchain ledger, which is based entirely on consensus, with no reliance on a trusted authority.
More recently, a new kind of distributed ledger has emerged that sits between the two: a permissioned distributed ledger shared across known parties, relying on a level of trust instead of proof of work to validate the chain. We discuss the technical details in Addendum II.
Let’s compare the three technologies:
| | Event sourcing ledger | Consensus Blockchain | Permissioned DLT |
| --- | --- | --- | --- |
| Transactions per second | Up to a million TPS depending on the technology (e.g. LMAX Disruptor); end-to-end throughput is often in the thousands of TPS | Roughly a one-hour delay in confirming a new block on the Bitcoin chain; Bitcoin peaks at around 7 TPS and Ethereum is limited to around 15 TPS | Hyperledger Fabric can reach 3,500 TPS and Corda around 1,600; however, the end-to-end DApps we have seen usually reach around a hundred TPS |
| Participants | Centralised authority | Unknown and unlimited participants | Limited, pre-authenticated participants |
| Purpose | Keep track of the history and approve/reject transactions as quickly as possible; provide auditability and compliance | Obtain consensus across all participants while removing the central authority | Lower the transaction cost with increased security amongst known participants who share a level of (but not complete) trust |
| Tech stack | LMAX Disruptor, Chronicle Queue, Kafka, EventStore, memory grids | Bitcoin Core, Ethereum (Go) | Hyperledger Fabric, Corda, DAML |
At Infinite Lambda, we can tailor the best solution based on your use case, as each solution would have vastly different considerations and design patterns.
Use case of leveraging both a centralised and a decentralised ledger
Here is an example of a use case that leverages both centralised and decentralised ledgers within one organisation to lower transaction costs. Stock brokers often maintain an online, real-time ledger and matcher to handle transactions: the order book matches buy and sell orders, while the ledger records the committed transactions. These transactions are then passed to the back office system for counterparty reconciliation and confirmation.
We could use an event sourcing ledger to record transactions in real time, allowing high throughput and low latency during business hours. There is no need for blockchain technology here because consensus is not required: Stock Broker A fully owns the source of truth in its own ledger, which approves or rejects each transaction.
However, when Stock Broker A and Stock Broker B agree to an over-the-counter (OTC) transaction, that transaction must be confirmed and reconciled to meet regulatory requirements. Such confirmation can take days to complete and entails massive costs in documentation flowing back and forth between the two brokers.
A permissioned blockchain-based solution would therefore speed up the confirmation process significantly. In the next blog post, I am going to show you how to create smart contracts to facilitate such a flow.
In this blog post, we have discussed Bitcoin and the blockchain technology behind it, with a deep dive into the Bitcoin Core source code. We have also touched upon Infinite Lambda's experience building high-performance event sourcing ledgers. We then compared centralised and decentralised ledger technologies, outlining their differences and demonstrating how to use both in a typical use case.
We have ample experience working with both centralised and decentralised ledger technologies. Please talk to us about your use case and we will suggest a solution tailored to your needs.
Addendum I: Bitcoin Core Illustration
To illustrate why a distributed ledger uses more computing power and network resources than a centralised one, let us take a deep dive into the Bitcoin Core C++ implementation as an example.
Bitcoin, like any permissionless blockchain, requires the transactions to be added to the chain by consensus. Let us start with the BlockHeader, which is the header for each of the blocks on the chain.
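The original post embedded Bitcoin Core's C++ `CBlockHeader` definition at this point. As a stand-in, here is a Go rendering of the same fields; the field names follow the C++ code, while the Go types are an approximation:

```go
package main

import "fmt"

// BlockHeader mirrors the fields of Bitcoin Core's C++ CBlockHeader.
// This Go version is illustrative only; the real type lives in the
// Bitcoin Core source tree.
type BlockHeader struct {
	NVersion       int32    // version of the client
	HashPrevBlock  [32]byte // hash of the previous block
	HashMerkleRoot [32]byte // root of the Merkle tree of this block's transactions
	NTime          uint32   // timestamp keeping blocks in strict order
	NBits          uint32   // compact encoding of the current difficulty target
	NNonce         uint32   // 32-bit nonce varied during mining
}

func main() {
	h := BlockHeader{NVersion: 2, NTime: 1700000000, NBits: 0x1d00ffff}
	fmt.Printf("version=%d bits=%08x\n", h.NVersion, h.NBits)
}
```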
The major fields to be populated are as follows:
- nVersion: the version of the client;
- hashPrevBlock: the hash of the previous block;
- hashMerkleRoot: the root of the Merkle tree built over this block's transactions, computed recursively as a binary tree of hashes up to a single root value;
- nTime: the timestamp as blocks are in strict order just like any ledger;
- nBits: the current difficulty that was used to create this block;
- nNonce: the nonce added to the input to compute the hash. This one-time nonce is a random integer and, since it is a 32-bit integer, there are 2^32 = 4,294,967,296 possibilities.
The following methods require implementation:
- GetHash: the SHA-256 hash of the block with the nonce provided. The hash itself is a 256-bit integer, so it cannot exceed 2^256 (a gigantic number). However, for a block to be accepted, its hash must be smaller than the target decided by the network;
- GetBlockTime: As with any ledger, the order of blocks is important. Here, we use a timestamp to ensure ordering.
In order to commit transactions (aggregated as a block) to the chain, one has to fill out the block header above. Computing the Merkle root requires a node to know every transaction going into the block, and nodes learn of pending transactions by gossiping with each other, which illustrates the enormous amount of network traffic needed to run a distributed permissionless ledger.
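The Merkle root computation can be sketched in Go. This is an illustrative reimplementation, not Bitcoin Core's code; like Bitcoin, it applies double SHA-256 to pairs of nodes and pairs an odd tail node with itself:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// dsha256 is Bitcoin's double SHA-256.
func dsha256(b []byte) [32]byte {
	first := sha256.Sum256(b)
	return sha256.Sum256(first[:])
}

// MerkleRoot hashes the leaves, then repeatedly pairs and hashes nodes
// until a single root remains. Leaves here are raw byte slices standing
// in for serialised transactions.
func MerkleRoot(leaves [][]byte) [32]byte {
	if len(leaves) == 0 {
		return [32]byte{}
	}
	level := make([][32]byte, len(leaves))
	for i, l := range leaves {
		level[i] = dsha256(l)
	}
	for len(level) > 1 {
		if len(level)%2 == 1 {
			level = append(level, level[len(level)-1]) // duplicate odd tail
		}
		next := make([][32]byte, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			next = append(next, dsha256(append(level[i][:], level[i+1][:]...)))
		}
		level = next
	}
	return level[0]
}

func main() {
	root := MerkleRoot([][]byte{[]byte("tx1"), []byte("tx2"), []byte("tx3")})
	fmt.Printf("%x\n", root)
}
```

Changing any single transaction changes its leaf hash and hence the root, which is why the Merkle root in the header pins down the block's entire contents.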
On the computing power front, another piece of code shows how the nonce (and hence an acceptable hash) is found by trial and error. Computing one hash is cheap, but computing a massive number of hashes with different nonces in order to meet the target is deliberately computationally expensive.
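The original post quoted Bitcoin Core's mining loop here. As a stand-in, the following Go sketch performs the same trial-and-error search with a toy "leading zero bytes" rule in place of Bitcoin's real hash-below-target comparison; the names and the difficulty rule are illustrative:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// mine increments the nonce until the double SHA-256 of header||nonce
// begins with zeroBytes zero bytes -- a toy stand-in for Bitcoin's
// "hash smaller than target" rule.
func mine(header []byte, zeroBytes int) (uint32, [32]byte) {
	buf := make([]byte, len(header)+4)
	copy(buf, header)
	for nonce := uint32(0); ; nonce++ {
		binary.LittleEndian.PutUint32(buf[len(header):], nonce)
		first := sha256.Sum256(buf)
		h := sha256.Sum256(first[:])
		ok := true
		for i := 0; i < zeroBytes; i++ {
			if h[i] != 0 {
				ok = false
				break
			}
		}
		if ok {
			return nonce, h // found a hash meeting the toy target
		}
	}
}

func main() {
	// One zero byte is an easy target: roughly 256 attempts on average.
	nonce, h := mine([]byte("block header bytes"), 1)
	fmt.Printf("nonce=%d hash=%x\n", nonce, h)
}
```

Each extra zero byte multiplies the expected number of attempts by 256, which mirrors how shrinking Bitcoin's target makes mining harder.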
This trial-and-error search is Bitcoin's proof of work. To compute an acceptable hash, one below the target, Bitcoin Core increments an extra nonce one by one, as seen in the line ++nExtraNonce in its miner code.
Once the nonce is incremented, the block hash must be recomputed, including re-hashing the Merkle root. Double spending is thereby prevented: an attacker who mutates any part of the chain will inevitably produce a different Merkle root and block hash, invalidating that block and every block built on top of it.
The nonce is incremented until an acceptable hash is found. The time to compute one is thus probabilistic, and grows as the target shrinks. In fact, the network monitors the mining speed and adjusts the target to ensure that one block is mined roughly every 10 minutes. Given that a transaction is generally only considered confirmed after six blocks, this implies a one-hour delay in confirming a transaction.
Addendum II: Permissioned DLT Illustration
Permissioned DLT (distributed ledger technology) avoids costly proof of work while still maintaining a blockchain of transactions across different parties. This makes permissioned DLT a more efficient technology for organisations confirming transactions beyond organisational boundaries.
Permissioned DLT requires all parties to the blockchain to be known and authenticated by each other. Early implementations put the certificates of each party in the genesis block, so that an authenticated party could add a block to the blockchain without proof of work.
Nowadays, permissioned DLT usually uses an additional node to enforce permissioning. In Hyperledger Fabric, for example, an orderer node is responsible for permissioning and the ordering of transactions.
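The original post showed the orderer interface from the Fabric source at this point. The sketch below approximates it in simplified form: the Order method name follows Fabric's ordering interface, while the Envelope type and the toy FIFO implementation are purely illustrative:

```go
package main

import "fmt"

// Envelope stands in for Fabric's signed transaction envelope.
type Envelope struct{ Payload []byte }

// Chain approximates the ordering interface of a Fabric orderer in
// simplified form; the real interface carries richer types.
type Chain interface {
	// Order accepts a normal transaction, called once per transaction.
	Order(env *Envelope, configSeq uint64) error
	Start()
	Halt()
}

// fifoChain is a toy implementation that orders envelopes
// first-in first-out at a single, trusted ordering point.
type fifoChain struct{ queue []*Envelope }

func (c *fifoChain) Order(env *Envelope, configSeq uint64) error {
	c.queue = append(c.queue, env)
	return nil
}
func (c *fifoChain) Start() {}
func (c *fifoChain) Halt()  {}

func main() {
	var c Chain = &fifoChain{}
	c.Order(&Envelope{Payload: []byte("tx1")}, 0)
	c.Order(&Envelope{Payload: []byte("tx2")}, 0)
	fmt.Println(len(c.(*fifoChain).queue)) // two envelopes, in arrival order
}
```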
Note that the orderer's Order method is called only once on each transaction. This injects a level of trust and centralisation, which is the premise of permissioned blockchain.
There are currently a number of implementations of the orderer node, including etcd (Raft), Kafka and Solo. If we dive into the Solo implementation, we can see that synchronisation is done using Go's built-in channels.
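The original post quoted the Solo orderer source here. As an illustrative stand-in, the following sketch reproduces the pattern: producers push transactions into a channel and a single goroutine drains it, so ordering is serialised by the channel rather than by locks:

```go
package main

import "fmt"

// orderViaChannel mimics the Solo orderer's pattern in simplified form:
// callers send transactions into a channel and one consenter goroutine
// consumes them, producing a single serialised order.
func orderViaChannel(txs []string) []string {
	sendChan := make(chan string)
	done := make(chan []string)
	go func() { // the single consenter loop
		var ordered []string
		for tx := range sendChan {
			ordered = append(ordered, tx)
		}
		done <- ordered
	}()
	for _, tx := range txs {
		sendChan <- tx // each send hands one transaction to the loop
	}
	close(sendChan)
	return <-done
}

func main() {
	fmt.Println(orderViaChannel([]string{"tx1", "tx2", "tx3"}))
}
```

With a single producer and a single consumer, channel delivery preserves submission order, which is what makes this simple design sufficient for a trusted, centralised ordering service.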
This again highlights the synchronisation and centralisation that allow a modern DLT to achieve higher transactional efficiency.
We love sharing expertise on the Blog, so take a look at our other posts too. You can also see examples of how we apply innovative tech to help companies grow through data in the Case Studies section.