After a few weeks of devoting every spare minute to one of my pet projects, here it is: version 0.1.0 of Web3scala, a library that allows seamless integration with the Ethereum blockchain using the Scala programming language.
Details can be found on web3scala.org, and the library itself is available in Maven Central as a dependency with the following coordinates: "org.web3scala" % "core" % "0.1.0".
While reading a book on blockchain recently (Daniel Drescher’s “Blockchain Basics”), I made the following notes:
- The term blockchain is ambiguous; it has different meanings for different people, depending on the context. It can refer to:
- A data structure
- An algorithm
- A suite of technologies
- A group of purely distributed peer-to-peer systems with a common application area
- Blockchain can be thought of as a purely distributed peer-to-peer system of ledgers, managed by an algorithm, which negotiates the informational content of ordered and connected blocks of data, in order to achieve and maintain its integrity. Managing and clarifying ownership is the most prominent application case of the blockchain (but not the only one).
Concepts and principles of a ledger
B. PROBLEM AREA
- Problem of ownership:
- A proof of ownership has three elements:
- Identification of the owner
- Identification of the object being owned
- Mapping the owner to the object
- ID cards, birth certificates, and driver’s licenses as well as serial numbers, production dates, production certificates, or a detailed object description can be used in order to identify owners and objects.
- The mapping between owners and objects can be maintained in a ledger, which plays the same role as a witness in a trial.
- Having only one ledger is risky since it can be damaged, destroyed, or forged. In this case, the ledger is no longer a trustworthy source for clarifying ownership.
- Instead of relying on a single central ledger, one can use a group of independent ledgers to document ownership and answer ownership requests with the version of reality on which the majority of ledgers agree.
- It is possible to create a purely distributed peer-to-peer system of ledgers by using the blockchain-data-structure. Each blockchain-data-structure represents one ledger and is maintained by one node of the system. The blockchain-algorithm is responsible for letting the individual nodes collectively arrive at one consistent version of the state of ownership. Cryptography is used to implement identification, authentication, and authorization.
- Integrity of a purely distributed peer-to-peer system of ledgers is found in its ability to make true statements about ownership and to ensure that only the lawful owner can transfer his or her property rights to others.
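To make the "majority of ledgers" idea concrete for myself, here is a minimal Python sketch. It assumes each ledger is simply a dictionary that maps an object identifier to its owner; the function name `majority_owner` and the sample data are my own illustration, not something taken from the book.

```python
from collections import Counter


def majority_owner(ledgers, item):
    """Return the owner recorded for `item` by a strict majority of ledgers, or None."""
    if not ledgers:
        return None
    # Each ledger casts one vote: the owner it has on record (None if unknown).
    votes = Counter(ledger.get(item) for ledger in ledgers)
    owner, count = votes.most_common(1)[0]
    return owner if count > len(ledgers) / 2 else None


# Three independent ledgers; the third one has been forged.
ledgers = [
    {"bike-4711": "alice"},
    {"bike-4711": "alice"},
    {"bike-4711": "mallory"},
]
print(majority_owner(ledgers, "bike-4711"))
```

A single forged ledger cannot change the answer here, which is exactly why one ledger alone is risky.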
- Problem of double spending
- Double spending can refer to:
- A problem caused by copying digital goods
- A problem that may appear in a distributed peer-to-peer system of ledgers
- An example of violating the integrity of distributed peer-to-peer systems
- It’s a vulnerability of purely distributed peer-to-peer systems of ledgers, and blockchain is a means to solve this problem
- The core problem to be solved by the blockchain is achieving and maintaining integrity in a purely distributed peer-to-peer system that consists of an unknown number of peers with unknown reliability and trustworthiness.
- In order to design a purely distributed peer-to-peer system of ledgers for managing ownership, one has to address the following tasks:
- Describing ownership
- Protecting ownership from unauthorized access
- Storing transaction data
- Preparing ledgers to be distributed in an untrustworthy environment
- Forming a distributed system of ledgers
- Adding and verifying new transactions to the ledgers
- Deciding which ledgers represent the truth
Concepts of ownership
- Transaction data provide the following information for describing a transfer of ownership:
- An identifier of the account that initiates the transaction and is to transfer ownership to another account
- An identifier of that account that is to receive ownership
- The amount of the goods to be transferred
- The time the transaction is to be done
- A fee to be paid to the system for executing the transaction
- A proof that the owner of the account that hands off ownership agrees with that transfer
- The complete history of transaction data is an audit trail that provides evidence of how people acquired and handed off ownership.
- Any transaction not being part of that history is regarded as if it never happened.
- A transaction is executed by adding it to the history of transaction data and allowing it to influence the result of aggregating them.
- The order in which transaction data are added to the history must be preserved in order to yield identical results when aggregating these data.
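The points about transaction data and ordering can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Transaction` fields mirror the list above, the signature is a plain placeholder string, fees are simply deducted from the sender, and a transaction the sender cannot afford is skipped during aggregation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str      # account that hands off ownership
    receiver: str    # account that receives ownership
    amount: int      # amount of goods to transfer
    timestamp: int   # time the transaction is to be done
    fee: int         # fee paid to the system
    signature: str   # placeholder for the sender's proof of agreement


def aggregate(history, opening_balances):
    """Replay the history in order and return the resulting balances."""
    balances = defaultdict(int, opening_balances)
    for tx in history:
        cost = tx.amount + tx.fee
        if balances[tx.sender] < cost:
            continue  # assumption: unaffordable transfers are ignored
        balances[tx.sender] -= cost
        balances[tx.receiver] += tx.amount
    return dict(balances)


tx1 = Transaction("alice", "bob", 10, 1, 0, "sig-1")
tx2 = Transaction("bob", "carol", 10, 2, 0, "sig-2")
print(aggregate([tx1, tx2], {"alice": 10}))
print(aggregate([tx2, tx1], {"alice": 10}))
```

The two calls replay the same transactions in a different order and end with different balances, which is why the order of the history has to be preserved.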
- In order to maintain integrity, only those transaction data are added to the blockchain-data-structure that fulfill the following three criteria:
- Formal correctness
- Semantic correctness
- Authorization
Identifying data from their digital fingerprint
- Hash functions transform any kind of data into a number of fixed length, regardless of the size of the input data.
- There are many different hash functions; they differ, among other things, in the length of the hash value they produce.
- Cryptographic hash functions are an important group of hash functions that create digital fingerprints for any kind of data.
- Cryptographic hash functions exhibit the following properties:
- Provide hash values for any kind of data quickly
- One-way usage
- Collision resistant
- Hash values can be used:
- To compare data
- To detect whether data that were supposed to stay unchanged have been altered
- To refer to data in a change-sensitive manner
- To store a collection of data in a change-sensitive manner
- To create computationally expensive tasks
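A quick way to see the fixed length and the change sensitivity is to try a cryptographic hash function directly. The sketch below uses SHA-256 from Python's standard `hashlib` module; the choice of SHA-256 and the helper name `fingerprint` are my own assumptions, the notes do not name a specific function.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash value of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short = fingerprint(b"Hello")
long_ = fingerprint(b"Hello" * 10_000)   # much larger input
changed = fingerprint(b"hello")          # one character altered

print(len(short), len(long_))  # same length regardless of input size
print(short == changed)        # a tiny change yields a different fingerprint
```

Comparing two fingerprints is therefore enough to compare the data, or to detect that data that should have stayed unchanged was altered.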
- The major goal of cryptography is to protect data from being accessed by unauthorized people. Main cryptographic activities are:
- Encryption: Protecting data by turning them into cypher text by utilizing a cryptographic key
- Decryption: Turning cypher text back into useful data by utilizing a matching cryptographic key
Calculating hash values
- Asymmetric cryptography always uses two complementary keys: cypher text created with one of these keys can only be decrypted with the other key and vice versa. When utilizing asymmetric cryptography in real life, these keys are typically called the public key and private key in order to highlight their role. The public key is shared with everyone, while the private key is kept secret. For this reason, asymmetric cryptography is also called public-private-key cryptography.
- There are two classical use cases of public and private keys:
- Everyone uses the public key to encrypt data that can only be decrypted by the owner of the corresponding private key. This is the digital equivalent to a public mailbox where everyone can put letters in but only the owner can open it.
- The owner of the private key uses it to encrypt data that can be decrypted by everyone who possesses the corresponding public key. This is the digital equivalent to a public notice board that proves authorship.
- The blockchain uses asymmetric cryptography in order to achieve two goals:
- Identifying accounts: User accounts are public cryptographic keys.
- Authorizing transactions: The owner of the account who hands off ownership creates a piece of cypher text with the corresponding private key. This piece of cypher text can be verified by using the corresponding public key, which happens to be the number of the account that hands off ownership.
- Digital signatures serve two purposes:
- Identifying the author uniquely
- Stating the author’s agreement with the content of a document and authorizing its execution
- In the blockchain, digital signatures of transactions are cryptographic hash values of transaction data encrypted with the private key that corresponds to the account that hands off ownership.
- Digital signatures in the blockchain can be traced back uniquely to one specific private key and to one specific transaction in one process.
There are two keys: a white key and a black key. Together they form the pair of corresponding keys. The original message is encrypted with the black key, which yields cypher text represented by the black box containing white letters. The original message can also be encrypted with the second key, which yields different cypher text represented by the white box containing black letters. For didactical reasons, the colors of the boxes representing cypher text and the colors of the keys used to produce them are identical in order to highlight their relation: The black key yields black cypher text, while the white key produces white cypher text. Black cypher text can only be decrypted with the white key and vice versa. The trick to asymmetric cryptography is that you can never decrypt cypher text with the key that was used to create it.
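The idea of a digital signature as "the hash value of the transaction data, encrypted with the private key" can be tried out with textbook RSA. The sketch below is a toy under loud assumptions: the primes 61 and 53 are far too small for any real use, no padding is applied, and real blockchains use other signature schemes. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib

# Toy key pair (assumption: tiny textbook primes, for illustration only).
P, Q = 61, 53
N = P * Q                 # modulus shared by both keys
PHI = (P - 1) * (Q - 1)
E = 17                    # public exponent
D = pow(E, -1, PHI)       # private exponent, the modular inverse of E


def digest(data: bytes) -> int:
    """Hash the data and reduce it so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Encrypt the hash value with the private key."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare hash values."""
    return pow(signature, E, N) == digest(data)


tx = b"alice pays bob 5"
signature = sign(tx)
print(verify(tx, signature))
```

Verifying the same signature against altered transaction data will almost always fail; with a modulus this small, accidental matches are possible, which is one more reason this is only a sketch.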
E. DATA STRUCTURE
- The blockchain-data-structure is a specific kind of data structure that is made up of ordered units called blocks.
- Each block of the blockchain-data-structure consists of a block header and a Merkle tree that contains transaction data.
- The blockchain-data-structure consists of two major data structures: an ordered chain of block headers and Merkle trees.
- One can imagine the ordered chain of block headers as being the digital equivalent to an old-fashioned library card catalog, where the individual catalog cards are sorted according to the order in which they were added to the catalog.
- Having each block header referencing its preceding block header preserves the order of the individual block headers and blocks, respectively, that make up the blockchain-data-structure.
- Each block header in the blockchain-data-structure is identified by its cryptographic hash value and contains a hash reference to its preceding block header and a hash reference to the application-specific data whose order it maintains.
- The hash reference to the application-specific data is typically the root of a Merkle tree that maintains hash references to the application-specific data.
Simplified blockchain-data-structure containing four transactions
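Here is a small Python sketch of how block headers and Merkle trees fit together. It rests on my own assumptions: SHA-256 as the hash function, transactions represented as plain strings, an odd node duplicated when a tree level has an odd number of entries, and a header reduced to just the two hash references described above.

```python
import hashlib
import json

NO_PREDECESSOR = "0" * 64  # assumed placeholder reference for the first block


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(transactions):
    """Pairwise-hash the transaction hashes until a single root remains."""
    if not transactions:
        return sha256_hex(b"")
    level = [sha256_hex(tx.encode()) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the odd node
        level = [
            sha256_hex((level[i] + level[i + 1]).encode())
            for i in range(0, len(level), 2)
        ]
    return level[0]


def block_header(previous_hash, transactions):
    """A header holds a reference to its predecessor and to its Merkle root."""
    return {"previous_hash": previous_hash, "merkle_root": merkle_root(transactions)}


def header_hash(header):
    """A block header is identified by the hash of its own content."""
    return sha256_hex(json.dumps(header, sort_keys=True).encode())


first = block_header(NO_PREDECESSOR, ["tx1", "tx2"])
second = block_header(header_hash(first), ["tx3", "tx4"])
print(second["previous_hash"] == header_hash(first))
```

The chain of headers keeps the order, while each Merkle root commits to the transactions of its block.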
F. STORING DATA
- The steps to be performed in order to add new transaction data to the blockchain-data-structure are:
- Create a new Merkle tree that contains all new transaction data to be added.
- Create a new block header that contains both a hash reference to its preceding header and the root of the Merkle tree that contains the new transaction data.
- Create a hash reference to the new block header, which is now the current head of the blockchain-data-structure.
- Changing data in the blockchain-data-structure requires renewing all hash references starting with the one that directly points to the manipulated data and ending with the head of the whole blockchain-data-structure as well as all hash references in between them.
- The blockchain-data-structure pursues a radical all-or-nothing approach when it comes to changing its data: One either changes the whole data structure completely, from the point that causes the change up to the head of the whole chain, or one had better leave it unchanged in the first place.
- All half-hearted, halfway through, or partial changes will leave the whole blockchain-data-structure in an inconsistent state, which will be detected easily and quickly.
- Changing the blockchain-data-structure completely is a very elaborate process on purpose.
- The high sensitivity of the blockchain-data-structure regarding changes is due to the properties of hash references.
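The all-or-nothing behaviour is easy to reproduce. In the simplified Python sketch below, each block stores its transactions directly instead of a Merkle root, and every name (`append_block`, `is_consistent`, `NO_PREDECESSOR`) is an assumption of mine made for illustration.

```python
import hashlib
import json

NO_PREDECESSOR = "0" * 64  # assumed placeholder reference for the first block


def block_hash(block: dict) -> str:
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def append_block(chain: list, transactions: list) -> str:
    """Add a block that references its predecessor; return the new head reference."""
    previous = block_hash(chain[-1]) if chain else NO_PREDECESSOR
    chain.append({"previous_hash": previous, "transactions": transactions})
    return block_hash(chain[-1])


def is_consistent(chain: list, head: str) -> bool:
    """Check every hash reference, from the first block up to the head."""
    for previous, current in zip(chain, chain[1:]):
        if current["previous_hash"] != block_hash(previous):
            return False
    return bool(chain) and block_hash(chain[-1]) == head


chain = []
head = NO_PREDECESSOR
for txs in (["a->b:5"], ["b->c:2"], ["c->a:1"]):
    head = append_block(chain, txs)

print(is_consistent(chain, head))       # untouched chain
chain[0]["transactions"][0] = "a->b:500"  # partial change in the oldest block
print(is_consistent(chain, head))       # the stale reference in block 2 gives it away
```

To hide the change, every hash reference after the manipulated block, including the head, would have to be renewed.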
G. DATA STORE PROTECTION
- The blockchain protects the history of transaction data from manipulation and forgery by storing transaction data in an immutable data store.
- The history of a transaction is made immutable by utilizing two ideas:
- Storing the transaction data in the change-sensitive blockchain-data-structure, which when being changed requires rewriting the data structure starting at the point that causes the change until the head of the whole chain.
- Requiring the solution of a hash puzzle for writing, rewriting, or adding every single block header in the blockchain-data-structure.
- The hash puzzle is unique for each block header because it depends on its unique content.
- The need to rewrite the blockchain-data-structure when it is changed and the costs of doing so make it unattractive to manipulate the history of transaction data in the first place.
- Requiring the solution of a hash puzzle for every writing, rewriting, or adding of block headers in the blockchain-data-structure turns it into an append-only data store.
- A block header contains at least the following data:
- A hash reference to the header of its preceding block
- The root of a Merkle tree that contains transaction data
- The difficulty of its hash puzzle
- The time when solving the hash puzzle was started
- The nonce that solves the hash puzzle
Hash puzzle required to be solved when adding a new block to the blockchain-data-structure
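A hash puzzle can be sketched as a search for a nonce. The version below is a simplification under my own assumptions: the difficulty is the number of leading zero characters required in the hexadecimal SHA-256 hash value, and the header content is just a string.

```python
import hashlib


def solve_puzzle(header_data: str, difficulty: int):
    """Try nonces until the hash of header data plus nonce starts with enough zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = hashlib.sha256(f"{header_data}{nonce}".encode()).hexdigest()
        if candidate.startswith(prefix):
            return nonce, candidate
        nonce += 1


header_data = "previous-hash|merkle-root|timestamp"
nonce, solution = solve_puzzle(header_data, difficulty=3)
print(nonce, solution)
```

Finding the nonce takes many attempts, while checking it takes a single hash calculation. Because the puzzle depends on the header content, changing a block means solving its puzzle again.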
H. VERIFYING AND ADDING TRANSACTIONS
- The blockchain-algorithm is a series of rules and instructions that governs the way in which transaction data are processed and added to the system.
- The challenge solved by the blockchain-algorithm is to keep the system open to everyone while ensuring that only valid and authorized transactions are added.
- The blockchain-algorithm utilizes the carrot-and-stick approach, combined with competition and peer control.
- The major idea of the blockchain-algorithm is to allow all nodes of the system to act as supervisors of their peers and reward them for adding valid and authorized transactions and for finding errors in the work of others.
- Due to the rules of the blockchain-algorithm, all nodes of the system have an incentive to process transactions correctly and to supervise and point out any mistakes made by the other peers.
- The blockchain-algorithm is based on the following concepts:
- Validation rules for transaction data and block headers
- Reward for submitting valid blocks
- Punishment for counteracting the integrity of the system
- Competition among peers for earning reward based on processing speed and quality
- Peer control
- The rules of the competition establish a two-step rhythm that governs the work of every node in the network. At any given point in time, every node of the system is in one of two phases:
- Evaluating a new block that was created by others
- Trying hard to be the next node that creates a new block that has to be evaluated by all others
- The working rhythm is imposed by the arrival of messages at the individual nodes.
- The majority of honest nodes and their striving for reward will outweigh the attempts of dishonest nodes to counteract the integrity of the system.
I. TRANSACTION HISTORY CHOICE
- Delays in sending new blocks across the network or two nodes creating new blocks nearly at the same time cause the blockchain-data-structure to grow into the shape of a tree or a columnar cactus with branches that arise from a common trunk that represent conflicting versions of the transaction history.
- Selecting an identical version of the transaction history is a collective decision-making problem.
- Distributed consensus is an agreement among the members of a purely distributed peer-to-peer system in a collective decision-making problem.
- The collective decision-making problem of the blockchain is characterized by the following facts:
- All nodes operate in the identical environment, consisting of the network, nodes that maintain their individual copies of the blockchain-data-structure, and the blockchain-algorithm that governs the behavior of the nodes.
- The decision-making problem is to select the identical transaction history across all nodes.
- All nodes strive to maximize their individual income earned as a reward for adding new valid blocks to the blockchain-data-structure.
- In order to achieve their goals, all nodes send their new blocks to all their peers to have them examined and accepted. As a result, each node leaves its individual footprint in the environment that is the collectively maintained blockchain-data-structure.
- All nodes use the identical criterion for selecting a history of transaction data.
- The longest-chain-criterion states that each node independently chooses the path of the tree-shaped blockchain-data-structure that contains the most blocks.
- The heaviest-chain-criterion states that each node independently chooses that path of the tree-shaped blockchain-data-structure that has the highest aggregated difficulty.
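The two selection criteria can pick different paths, which a short Python sketch makes visible. It assumes every path of the tree is given as a list of blocks and that each block records the difficulty of its hash puzzle; the function names and sample numbers are made up for illustration.

```python
def longest_chain(paths):
    """Choose the path that contains the most blocks."""
    return max(paths, key=len)


def heaviest_chain(paths):
    """Choose the path with the highest aggregated difficulty."""
    return max(paths, key=lambda path: sum(block["difficulty"] for block in path))


# Two conflicting paths that share a common trunk (not shown here).
path_a = [{"id": "a1", "difficulty": 1}, {"id": "a2", "difficulty": 1}, {"id": "a3", "difficulty": 1}]
path_b = [{"id": "b1", "difficulty": 4}, {"id": "b2", "difficulty": 4}]

print(longest_chain([path_a, path_b]) is path_a)
print(heaviest_chain([path_a, path_b]) is path_b)
```

Whichever criterion is used, all nodes have to apply the same one, otherwise they would not converge on an identical transaction history.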
- Selecting one path of the tree-shaped blockchain-data-structure has the following consequences:
- Orphan blocks
- Reclaimed reward
- Clarifying ownership
- Reprocessing of transactions
- A growing common trunk
- Eventual consistency
- Robustness against manipulations
- The deeper down the authoritative chain a block is located:
- The further in the past it was added
- The more time has passed since its inclusion in the blockchain-data-structure
- The more common effort has been spent on adding subsequent blocks
- The less it is affected by random changes of the blocks that belong to the longest chain
- The less likely it will be abandoned
- The more accepted it is by the nodes of the system
- The more anchored it is in the common history of the nodes
- The fact that certainty concerning the inclusion of blocks in the authoritative chain increases as time goes by and as more blocks are added is called eventual consistency.
- A 51 percent attack is an attempt to gather or control the majority of the whole voting power in a collective decision-making process with the goal of turning blocks that are part of the authoritative chain into orphan blocks and establishing a new authoritative chain that contains a transaction history that is more favorable from the attacker’s point of view.
- A 51 percent attack has the following characteristics:
- Economically: Changing the allocation of ownership rights by changing the collective history of transaction data.
- Decision making: Gathering the majority of voting power in order to enforce a desired result.
- Technically: Undermining the integrity of the system.
- Architecturally: Establishing, at least temporarily, a hidden element of centrality that changes the state of the system.
- The blockchain utilizes fees for compensating its peers for contributing to the integrity of the system.
- The instrument of payment used to compensate peers (e.g., Bitcoin) has an impact on major aspects of the blockchain, such as:
- The distributed nature
- The philosophy of the system
- Desirable properties of an instrument of payment for compensating peers are:
- Being available in digital form
- Being accepted in the real world
- Being accepted in all countries
- Not being subject to capital movement restrictions
- Being trustworthy
- Not being controlled by one single central organization or state
- A cryptocurrency is an independent digital currency whose ownership is managed by a blockchain that uses it as an instrument of payment for compensating its peers for maintaining the integrity of the system.
- The openness of the blockchain and the absence of any form of central control are the fundamentals of its functioning but can also cause limitations for its adoption.
- Major technical limitations of the blockchain are:
- Lack of privacy
- The security model
- Limited scalability
- High costs
- Hidden centrality
- Lack of flexibility
- Critical size
- The most important nontechnical limitations of the blockchain are:
- Lack of legal acceptance
- Lack of user acceptance
- Technical limitations of the blockchain can be overcome by improving the existing technology or by introducing conceptual changes.
- The nontechnical limitations of the blockchain can be overcome by educational and legislative initiatives.
L. OTHER TYPES OF BLOCKCHAIN
- The blockchain inherently contains the following conflicts:
- Transparency vs. privacy: On the one hand, transparency is needed for clarifying ownership and preventing double spending, but on the other hand, its users require privacy.
- Security vs. speed: On the one hand, protecting the history of transaction data from being manipulated is done by utilizing the computationally expensive proof of work, but on the other hand, speed and scalability are required in most commercial contexts.
- The transparency vs. privacy conflict has its root in the allocation of reading access rights to the blockchain-data-structure.
- The security vs. speed conflict has its root in the allocation of writing access rights to the blockchain-data-structure.
- Solving the transparency vs. privacy conflict led to the following versions of the blockchain:
- Public blockchains grant reading access and the right to create new transactions to all users or nodes.
- Private blockchains limit reading access and the right to create new transactions to a preselected group of users or nodes.
- Solving the security vs. speed conflict led to the following versions of the blockchain:
- Permissionless blockchains grant writing access to everyone. Every user or node can verify transaction data and create and add new blocks to the blockchain-data-structure.
- Permissioned blockchains grant writing access only to a limited group of preselected nodes or users that are identified as trustworthy through an on-boarding process.
- Combining these restrictions pairwise led to the emergence of four different kinds of blockchains.
- Restricting reading or writing access results in consequences on the following properties of the blockchain:
- The peer-to-peer architecture
- The distributed nature
- Its purpose
- The blockchain-technology-suite creates value even in restricted environments for the following reasons:
- The number of nodes can vary due to technical failures or downtime.
- Every distributed system faces the adversities of networks, which make communication at the level of individual messages unreliable.
- Even an on-boarding process may not guarantee the trustworthiness of nodes at a 100 percent level.
- Even trustworthy nodes may yield wrong results due to technical failures.
M. BLOCKCHAIN USAGE
- The blockchain can be considered a purely distributed data store with additional properties such as being immutable, append-only, ordered, time-stamped, and eventually consistent.
- Being a generic data store means that the blockchain can store a wide range of data, which in turn makes it usable in a wide range of application areas.
- Based on its properties, we can identify the following generic-use patterns of the blockchain:
- Proof of existence
- Proof of nonexistence
- Proof of time
- Proof of order
- Proof of identity
- Proof of authorship
- Proof of ownership
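Several of these patterns boil down to storing a fingerprint together with a time stamp in an append-only store. The Python sketch below stands in for a blockchain with a plain in-memory list, so it only illustrates the lookup logic; the class `ExistenceLog` and its methods are hypothetical names of mine.

```python
import hashlib
from typing import Optional


class ExistenceLog:
    """Append-only list of (fingerprint, timestamp) entries."""

    def __init__(self):
        self._entries = []

    def register(self, document: bytes, timestamp: int) -> str:
        """Record the document's fingerprint; the document itself is not stored."""
        fingerprint = hashlib.sha256(document).hexdigest()
        self._entries.append((fingerprint, timestamp))
        return fingerprint

    def first_seen(self, document: bytes) -> Optional[int]:
        """Return the earliest recorded timestamp, or None if never registered."""
        fingerprint = hashlib.sha256(document).hexdigest()
        for stored, timestamp in self._entries:
            if stored == fingerprint:
                return timestamp
        return None


log = ExistenceLog()
log.register(b"contract v1", timestamp=100)
log.register(b"contract v2", timestamp=200)

print(log.first_seen(b"contract v1"))   # proof of existence and of time
print(log.first_seen(b"contract v3"))   # nothing recorded: proof of nonexistence
```

The position of an entry in the log gives the proof of order; in a real blockchain the immutability of the store is what makes such answers trustworthy.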
- Specific application areas of the blockchain that have already received attention or may receive attention in the future are:
- Digital assets
- Digital identity
- Notary services
- Compliance and audit
- Record management
- When analyzing specific blockchain applications or blockchain services, some questions need to be answered:
- What kind of blockchain is used?
- Are the requirements for using the blockchain fulfilled?
- What is the added value of using a purely distributed peer-to-peer system?
- What is the application idea?
- What is the business case?
- How are peers compensated for contributing resources to the system?
- The blockchain has been and will continue to be the subject of further improvements and developments such as variations in its implementation, improving efficiency, improving scalability, and conceptual advances.
- Smart contracts, zero-knowledge proofs, and alternative ways to achieve consensus are major areas of conceptual advancement of the blockchain.
- Besides its technical merits, the blockchain may be honored for the following long-term accomplishments:
- Streamlining processes
- Increased processing speed
- Cost reduction
- Shift toward trust in protocols and technology
- Making trust a commodity
- Increased technology awareness
- Possible disadvantages of the blockchain are:
- Lack of privacy
- Loss of personal responsibility
- Loss of jobs
- Possible usages of the blockchain to be seen in the future are:
- Limited enthusiast projects
- Large-scale commercial projects
- Governmental projects
- The blockchain is a purely distributed peer-to-peer system that addresses the following aspects of managing ownership:
- Describing ownership: History of Transaction Data
- Protecting ownership: Digital Signature
- Storing transaction data: Blockchain-Data-Structure
- Preparing ledgers for being distributed: Immutability
- Distributing ledgers: Gossip-Style Information Forwarding Through a Network
- Processing new transactions: Blockchain-Algorithm
- Deciding which ledger represents the truth: Distributed Consensus
- Analyzing the blockchain involves the following aspects:
- The application goal
- Its properties
- Its internal functioning
- The blockchain has two application goals:
- Clarifying ownership
- Transferring ownership
- The blockchain fulfills its application goals while exhibiting the following qualities:
- Highly available
- Censorship proof
- Eventually consistent
- Keeping integrity
- Internally the blockchain consists of components that are either specific or agnostic to the application goal of managing ownership.
- The application-specific components of the blockchain are:
- Ownership logic
- Transaction data
- Transaction processing logic
- Transaction security
- The application-agnostic components are:
- The blockchain-technology-suite
- The purely distributed peer-to-peer architecture
- The blockchain-technology-suite consists of:
- Storage logic
- Consensus logic
- Data processing logic
- Asymmetric cryptography
If you find the subject of blockchain interesting and would like a more in-depth understanding of it, I strongly encourage you to read Daniel Drescher’s “Blockchain Basics”.
Below is probably the quickest way to set up key-based SSH login:
1. Generate SSH key (if you don’t have one already)
ssh-keygen -t rsa
2. Use SSH to create a remote directory ~/.ssh
ssh email@example.com mkdir -p .ssh
3. Append your public key to .ssh/authorized_keys on remote host
cat ~/.ssh/id_rsa.pub | ssh firstname.lastname@example.org 'cat >> .ssh/authorized_keys'