Alexander Chepurnoy

The Web of Mind

Authenticated Dynamic Dictionaries, With Applications to Cryptocurrencies


This article is about the paper “Improving Authenticated Dynamic Dictionaries, with Applications to Cryptocurrencies”, to appear at Financial Cryptography 2017. It was also presented at Real World Crypto 2017, and I highly recommend watching Leonid’s impressive presentation from the conference.

Some background. Previously I worked for the Nxt platform, which has assets and many other cool features. The problem is that blockchain processing becomes incredibly heavyweight as new features are added (even with a pretty low number of transactions in comparison with Bitcoin). Ethereum has the same problem these days: after the autumn attacks, it is nearly impossible to wait for processing to finish on an ordinary laptop.

The problem lies in persisting the state (e.g. the UTXO set in Bitcoin). Once it hits secondary storage (HDD or SSD), processing becomes very slow.

Thus there are two considerations behind our work on AVL+ trees and the proposed scheme for cryptocurrencies:

  • It should be feasible to run a full node (maybe not a mining node) on commodity hardware

  • Initial blockchain processing, and then block processing, must use RAM only

As commodity hardware is pretty limited in RAM, the idea is not to store the state on full nodes at all. The scheme is as follows:

  1. The state is authenticated with the help of a 2-party dynamic authenticated dictionary.

  2. A mining node stores the whole state. When packing transactions into a block, it generates proofs of the authenticated state transformations and announces the new root hash, obtained after the transformations are done, in the block header. The proofs are to be included in the block.

  3. A full node receiving the block checks that 1) the transactions are correct (format, signatures, etc.), 2) the state transformation operations derived from the transactions correspond to the proofs, 3) the proofs are correct, and 4) the resulting root hash (the verifier obtains it just by processing the proofs) is the same as the announced one. Thus the node checks everything, but without holding the state (e.g. the UTXO set).
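The checks a full node performs in step 3 can be sketched in Scala as follows. This is only an illustration of the control flow: the names StatelessFullNode, ProofVerifier, replay and validate are assumptions made for this sketch, not the API of the released code, and the actual proof verification (the AVL+ part) is left abstract behind the ProofVerifier trait. For brevity, transactions double as state operations here.

```scala
object StatelessFullNode {
  type Digest = Seq[Byte]

  // A block as seen by a verifier: transactions, a proof of the state
  // transformations, and the root hash announced in the block header.
  final case class Block(txs: Seq[String], proof: Seq[Byte], announcedRoot: Digest)

  // Hypothetical verifier interface (an assumption of this sketch): replays the
  // operations against the proof, starting from the previous root hash, and
  // returns the resulting root hash, or None if the proof does not match.
  trait ProofVerifier {
    def replay(prevRoot: Digest, ops: Seq[String], proof: Seq[Byte]): Option[Digest]
  }

  // Checks 1) to 4) of step 3, without access to the state itself.
  def validate(prevRoot: Digest,
               block: Block,
               txIsCorrect: String => Boolean,
               verifier: ProofVerifier): Boolean =
    block.txs.forall(txIsCorrect) &&
      verifier.replay(prevRoot, block.txs, block.proof).contains(block.announcedRoot)
}
```

The only thing the node keeps between blocks is the previous root hash, which is the point of the scheme.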

The paper is then about finding the most efficient structure among many candidates (and the winner is custom-tailored authenticated AVL+ trees).

Not mentioned in the paper, but worth saying: the proofs in a block could be authenticated themselves (with the help of a Merkle tree, which is perfect for static data), with the root hash included in the block header. Then a node holding the state could skip downloading proofs from the network, and there is also the possibility of pruning them in the future (this scheme resembles the SegWit proposal for Bitcoin).
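A minimal sketch of such a Merkle root over the proofs of a block could look like this. The object and function names, the choice of SHA-256, the one-byte leaf/node prefixes and the rule for an odd node are all assumptions of this sketch, not something fixed by the paper or by the released code.

```scala
import java.security.MessageDigest

object ProofsMerkleRoot {
  private def sha256(bytes: Array[Byte]): Array[Byte] =
    MessageDigest.getInstance("SHA-256").digest(bytes)

  // Different prefixes keep leaf hashes and internal node hashes apart.
  def leafHash(proof: Array[Byte]): Array[Byte] = sha256(0.toByte +: proof)
  def nodeHash(left: Array[Byte], right: Array[Byte]): Array[Byte] =
    sha256((1.toByte +: left) ++ right)

  // Root hash over the proofs of a block, to be put into the block header.
  def root(proofs: Seq[Array[Byte]]): Array[Byte] = {
    require(proofs.nonEmpty, "at least one proof is needed")
    @annotation.tailrec
    def loop(level: Vector[Array[Byte]]): Array[Byte] =
      if (level.size == 1) level.head
      else loop(level.grouped(2).map { pair =>
        // An odd node at the end of a level is promoted unchanged.
        if (pair.size == 2) nodeHash(pair(0), pair(1)) else pair(0)
      }.toVector)
    loop(proofs.map(leafHash).toVector)
  }
}
```

A node that already holds the state only needs the 32-byte root from the header, while a stateless node downloads the proofs and recomputes the root to check them.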

Proofs add a significant burden in terms of block size (a proof can actually be longer than the corresponding transaction), so the decreased throughput is to be considered seriously.

The code was released during RealWorldCrypto (section “Authenticated data structures”). There are some possible further minor optimizations (possibly reducing proof size by a few percent in total) that we are now discussing.

Scorex 2.0: A Full-Node View


In this article we will go through the abstractions behind the state replication mechanism of a full node in a blockchain system.

A full node is a node which holds at least enough state to check whether an arbitrary transaction is valid against it, and so applicable to it, or not. We can define such a state, the minimal state, as an abstract component with just one basic operation:

trait MinimalState[TX <: Transaction] {
  def apply(transaction: TX): Try[MinimalState[TX]]
}

The result of the apply function is either an updated state or an error if the transaction is not applicable: Success[MinimalState[TX]] | Failure(...) (Try[A] is just a sum type for these two options).

Even with such a minimalistic definition, we can formulate a first law: we cannot apply the same transaction twice. In the form of a ScalaCheck-based property test, that could look like

forAll { (minState: MinimalState[TX], tx: TX) =>
  minState.apply(tx).flatMap(_.apply(tx)).isFailure
}

In Bitcoin, the minimal state is the unspent transaction output set, and a successful transaction application removes the outputs spent by the transaction and adds the outputs it creates. It is obviously impossible to apply the transaction again.
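To make the law concrete, here is a toy UTXO-like minimal state. It is a sketch under simplifying assumptions, not the Scorex implementation: the types Output, Tx and UtxoState are hypothetical, outputs are identified by plain strings, there are no scripts or signatures, and there are no coinbase transactions (so the example starts from a state that already holds one output).

```scala
import scala.util.{Failure, Success, Try}

object ToyUtxo {
  final case class Output(id: String, amount: Long)
  final case class Tx(spends: Set[String], creates: Set[Output])

  // Toy minimal state: the set of unspent outputs, indexed by output id.
  final case class UtxoState(unspent: Map[String, Output]) {
    def apply(tx: Tx): Try[UtxoState] = {
      val spendable = tx.spends.nonEmpty && tx.spends.subsetOf(unspent.keySet)
      val fresh = tx.creates.forall(o => !unspent.contains(o.id))
      if (!spendable) Failure(new IllegalArgumentException("an input is not an unspent output"))
      else if (!fresh) Failure(new IllegalArgumentException("an output id already exists"))
      else {
        val in = tx.spends.toSeq.map(id => unspent(id).amount).sum
        val out = tx.creates.toSeq.map(_.amount).sum
        if (out > in) Failure(new IllegalArgumentException("outputs exceed inputs"))
        // Spent outputs are removed, newly created outputs are added.
        else Success(UtxoState((unspent -- tx.spends) ++ tx.creates.map(o => o.id -> o)))
      }
    }
  }
}
```

After a successful application the spent outputs are gone from the state, so applying the same transaction to the resulting state fails on the first check.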

We need some starting point to start applying transactions from. We call it the genesis state. For Bitcoin, the genesis state is just an empty set.

Now we want all the nodes in an open network to have the same minimal state eventually. For that, we need to save a log of transformations and be sure the log is eventually the same on all the honest nodes in the presence of Byzantine adversaries.

This is achieved via the blockchain log structure. We pack transactions into blocks and fix the order with a hashchain-like structure. Block application to a minimal state is deterministic, so starting with the same hard-coded genesis state, all the honest nodes get the same minimal state after applying the same blockchain. Thus they share the same view on the validity of an arbitrary transaction.

Things are not so simple though. As nodes are equal and there is no arbiter in the network, some consensus protocol working in a decentralized environment is needed to append new blocks. Sometimes collisions occur, and while the Bitcoin protocol tries to ignore them, some alternatives to it (GHOST/SPECTRE) take an explicit blocktree model into account. The Rollerchain proposal aims to achieve full-node security for a node that applies a state snapshot and then some number of full blocks to it. Bitcoin-NG and ByzCoin split blocks into empty blocks created with Proof-of-Work, followed by microblocks with transactions. We generalize the notion of a log member, calling it a persistent node view modifier:

trait PersistentNodeViewModifier[TX <: Transaction] extends NodeViewModifier {

  // with Dotty it would be Seq[TX] | Nothing
  def transactions: Option[Seq[TX]]
}

and then we define an abstract history:

trait History[TX <: Transaction, PM <: PersistentNodeViewModifier[TX]] {
  type ApplicationResult = Try[(History[TX, PM], Option[RollbackTo[PM]])]

  def contains(block: PM): Boolean

  def append(block: PM): ApplicationResult
}

Note that appending a block has an optional additional side effect: some information about a rollback performed during the append. We skip the details for now.

In addition to the state modifiers log and the minimal state, a full node also contains two more entities. The memory pool contains transactions not yet included into blocks (and there’s no guarantee of inclusion for them). The vault contains node-specific information that a node extracts from the log. For example, it could contain values encoded in some or all OP_RETURN instructions, or all the transactions for specific addresses. The well-known example of a vault is a wallet, which contains private keys as well as transactions associated with their public images.

With the four entities defined, we can now explicitly state a node view type:

type NodeView = (History[TX, PMOD], MinimalState[TX, PMOD], Vault[TX], MemoryPool[TX])

And by having the compound entity we can enforce rules for its modification:

  • an off-chain transaction modifies the vault and the memory pool. Atomicity of this update is not critical.
  • for a persistent node view modifier (block header, full block, key block, microblock, state snapshot), atomicity of an update is strictly needed! If the history produces a rollback side effect, the other parts must handle it properly before applying an update. This sounds trivial, but in fact many implementations spend years fighting bugs related to inconsistency and read-while-update issues.
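One way to get the atomicity required for persistent modifiers is to compute the whole new view as a pure function and swap it in with a single assignment. The sketch below illustrates this with toy types of my own (Block, View, applyBlock are hypothetical, not the Scorex 2.0 signatures). It assumes the state is just the set of applied transaction ids, and, unlike a real consensus protocol, it rolls back whenever a block's parent is not the current tip.

```scala
import scala.util.{Failure, Success, Try}

object NodeViewSketch {
  final case class Block(id: String, parentId: String, txs: Seq[String])
  final case class View(history: Vector[Block], state: Set[String], mempool: Set[String])

  val empty: View = View(Vector.empty, Set.empty, Set.empty)

  // Returns a complete new view or a failure, never a partially updated one.
  def applyBlock(view: View, block: Block): Try[View] = {
    val parentIdx = view.history.indexWhere(_.id == block.parentId)
    val isFirst = view.history.isEmpty && block.parentId.isEmpty
    if (parentIdx < 0 && !isFirst) Failure(new NoSuchElementException("unknown parent block"))
    else {
      // Blocks after the parent are rolled back (none if the parent is the tip).
      val (kept, rolledBack) = view.history.splitAt(parentIdx + 1)
      val stateAfterRollback = kept.flatMap(_.txs).toSet
      val repeated = block.txs.exists(stateAfterRollback.contains) ||
        block.txs.distinct.size != block.txs.size
      if (repeated) Failure(new IllegalArgumentException("transaction applied twice"))
      else Success(View(
        history = kept :+ block,
        state = stateAfterRollback ++ block.txs,
        // Rolled-back transactions return to the pool unless the new block includes them.
        mempool = (view.mempool ++ rolledBack.flatMap(_.txs)) -- block.txs
      ))
    }
  }
}
```

A caller would keep the current View in one variable and replace it only on Success, so readers never observe a history that has been rolled back while the state or memory pool has not.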

That is all for now! To be continued!

P.S. Please note the real entities in Scorex 2.0 Core have more complex type signatures.

P.P.S. SPV nodes do not hold a sufficiently rich state to validate an arbitrary transaction.

RollerChain, a Blockchain With Safely Pruneable History


When you start a Bitcoin node, it downloads all the transactions made over more than 7 years in order to check them all. People often ask on community resources whether it is possible to avoid that. In a more interesting formulation, the question would be: “can we get full-node security without going from the genesis block?”

The question becomes even more important if we consider the following scenario. Full blocks with transactions are needed only in order to update a minimal state, that is, some common state representation sufficient to check whether an arbitrary transaction is valid (against the state) or not. In the case of Bitcoin, the minimal state is the unspent outputs set (we call this state minimal because a node could also store some additional information, e.g. historical transactions for selected addresses, but this information is not needed to check the validity of an arbitrary transaction). With this state at hand (plus some additional data to perform possible rollbacks), full blocks are not needed anymore and so could be removed.

In Bitcoin, full nodes store all the full blocks since genesis without a clear selfish need. This is altruistic behavior, and we cannot expect nodes to follow it in the long term. But if all the nodes are rational, how can a new node download and replay the history?

The proposal recently put on arXiv tries to solve the problems mentioned with a new Proof-of-Work consensus protocol. I list the modifications relative to Bitcoin here very briefly; for details please read the paper.

  1. The state is to be represented as an authenticated data structure (a Merkle tree is the simplest example of such a data structure), and its root value is to be included in the block header. This is a pretty old idea, already implemented in Ethereum (and some other coins).

  2. We then modify the Proof-of-Work function. A miner uniformly chooses k state snapshot versions out of the last n, which a (sufficiently large) network stores collectively. In order to generate a block, the miner needs to provide proofs of possession for all the chosen state snapshots. On a new block arrival a miner updates k+1 states, not one, so full blocks (starting from the oldest of the k snapshots) are also needed.

Thus miners store a distributed database of the last n full blocks AND state snapshots, getting rewards for that activity. A new node first downloads block headers since genesis (a header in Bitcoin is just 80 bytes; in Rollerchain it is 144 bytes if a hash function with 32-byte output is chosen). Then it could download the latest snapshot, or the one from n blocks ago, or one from somewhere in between. It is proven that this scheme achieves the security of a full node going from genesis, with the probability of failure going down exponentially with n (see Theorem 2 in the paper). Full blocks not needed for mining could be removed by all the nodes. Nodes can still store them as in Bitcoin, but we do not expect that anymore.
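To get a feeling for what downloading headers since genesis costs, here is a back-of-the-envelope calculation. The chain length is an assumption of mine for illustration (7 years of blocks, one block every 10 minutes, the same for both chains); only the header sizes come from the text above.

```scala
object HeaderSizes {
  val blocksPerDay: Long = 24 * 6            // one block per 10 minutes (assumption)
  val blocks: Long = 7 * 365 * blocksPerDay  // 7 years of blocks (assumption)

  def headersBytes(headerSize: Long): Long = blocks * headerSize

  val bitcoinBytes: Long = headersBytes(80)       // 80-byte headers
  val rollerchainBytes: Long = headersBytes(144)  // 144-byte headers
}
```

Under these assumptions there are 367,920 headers, which take about 29 MB with 80-byte headers and about 53 MB with 144-byte headers, so the bigger header is a modest price compared with downloading all full blocks.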

A RollerChain full node stores only a sliding window of full blocks, thus saving disk space. Less bandwidth is also needed to feed new nodes in the network, so the bandwidth saved could be repurposed for other tasks.

The Moral Character of Cryptocurrency-Related Work


We are going to live in the post-DAO world, whether Ethereum is (hard- or soft-) forked or not.

One of the most important questions hasn’t been answered by the inner circle of Ethereum. It is not even being asked loudly enough. The question is what’s inside the Gordian knot of corrupt ties in the inner circle.

Former Ethereum members founded a startup. Then the team started The DAO venture, partly in order to get funding for itself. The DAO was supported by many Ethereum core team members, including Vitalik.

As a result, a lot of tough questions are to be asked after the DAO crash. Would a (hard or soft) fork have been proposed by the Ethereum team if no Ethereum members had participated in The DAO? Who is in the inner circle? What are the names of other projects to be saved with a fork in case of disaster?

Fortunately, I am not in the Ethereum world at all, so I do not know the answers. Hopefully, some investigative journalists will dig there.

What’s interesting to me is how to avoid dubious scenarios in the future. I think we need to consider some moral ground for core developers and foundation members.

At the very least, it must be prohibited to work for a blockchain core and for any for-profit project built on top of it at the same time, or even for some time after leaving the core team.

Sometimes developers and foundation members work for other projects because it is nearly impossible to pay the bills by developing a core product. For example, I left Nxt mostly because of pretty small rewards. For a developer with vast experience it is easy to find a well-paid job. And core development requires highly skilled developers. So it is not easy to get away from multi-million ICOs. A highly skilled developer team, security audits, consultations with academics and so on could not cost less than a couple of million USD. I don’t know about marketing, but I suspect it is not cheaper.

However, spending must be transparent. Key meetings must be transparent as well, and the considerations behind key decisions should be described in detail.

We need to re-consider governance models, again.

P.S. This post reflects my personal position only.

P.P.S. The title resembles “The Moral Character of Cryptographic Work” by P. Rogaway

Cryptocurrency State Representation: Boxes vs Accounts


A cryptocurrency is a decentralized, replicated, blockchain-based ledger system. What is a ledger? A ledger is the state of a system in some minimal form that lets us answer whether a transaction is valid against it or not. Another question a ledger should answer is whether a transaction is already included in it or not. How could a ledger be represented then? There are two popular approaches.

Box Model

The first is used in Bitcoin and its successors. A ledger in Bitcoin could be represented as a list of unspent transaction outputs (UTXOs in Bitcoin jargon). Given a transaction, it is easy to check whether it is valid: all the transaction inputs must be connected to unspent outputs, and the inputs must also provide a valid solution (in the form of a stack machine script) to the spending guard condition of the UTXO (in the form of a stack machine script as well). Also, the sum of bitcoins associated with the inputs must be not less than the sum of bitcoins associated with the outputs. The outputs spent are to be removed from the ledger, and the new ones the transaction contains are to be added to it.

Abstracting the Bitcoin-like model, a state could be represented as a list of closed boxes. Each box has a value associated with it. A transaction contains keys to open some boxes, plus new closed boxes. A transaction is valid if all the keys in it open closed boxes in the ledger and the sum of the new closed box values is no more than the sum of the values of the boxes to be opened.

It is not hard to see that the second question (whether a transaction was already processed into the ledger or not) could be answered trivially in the box model. If a transaction was already processed, its boxes are open and so not stored in the ledger anymore.
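The abstract box model fits in a few lines of Scala. This is a toy sketch with hypothetical names (Box, Key, Tx, isValid, applyTx): the lock is just a string that the key must repeat, standing in for a real spending guard such as a script or a signature check, and box ids are assumed to be assigned by the transaction author.

```scala
object BoxModel {
  // `lock` is a toy stand-in for a spending guard: the box opens if the key
  // carries the same string. A real system would use a script or a signature.
  final case class Box(id: Int, value: Long, lock: String)
  final case class Key(boxId: Int, secret: String)
  final case class Tx(keys: Seq[Key], newBoxes: Seq[Box])

  type Ledger = Map[Int, Box]

  def isValid(ledger: Ledger, tx: Tx): Boolean = {
    val opened = tx.keys.flatMap(k => ledger.get(k.boxId).filter(_.lock == k.secret))
    // Every key must open a distinct closed box that is present in the ledger.
    val allOpened = tx.keys.nonEmpty && opened.size == tx.keys.size &&
      tx.keys.map(_.boxId).distinct.size == tx.keys.size
    val freshIds = tx.newBoxes.forall(b => b.value >= 0 && !ledger.contains(b.id))
    allOpened && freshIds && tx.newBoxes.map(_.value).sum <= opened.map(_.value).sum
  }

  // Opened boxes leave the ledger, new closed boxes enter it.
  def applyTx(ledger: Ledger, tx: Tx): Option[Ledger] =
    if (isValid(ledger, tx))
      Some((ledger -- tx.keys.map(_.boxId)) ++ tx.newBoxes.map(b => b.id -> b))
    else None
}
```

Replaying a transaction fails without any extra bookkeeping: the boxes it opens are no longer in the ledger, which is exactly the answer to the second question.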

Unfortunately, there are no known box systems other than Bitcoin-like ones around, but I bet we’ll see them some day.

Account Model

While the box state model is about immutable objects that can only be created or destroyed, we can also represent state with mutable accounts. A transaction involves existing accounts. It modifies them and maybe creates some new accounts. This approach is adopted by Nxt and Ethereum. In its simplest form, it is implemented in the Simplest Transactional Module of the Scorex project. Basically, the state in the Simplest Transactional Module is just a map (account -> balance).

While this model answers pretty well the question of whether a transaction is valid against a state (in the simplest case above, it is just about whether an account has an appropriate balance), the second question (whether a transaction was already included into the state or not) becomes tricky. As a possible solution, a nonce could be used, as is done in Ethereum. That is, a transaction contains an always-increasing nonce (so the next transaction must have a bigger value in the nonce field than the current one), and the last nonce value used is to be stored in the state.
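The nonce idea can be illustrated with a small account-based state. This is my own toy sketch, not the code of the Simplest Transactional Module or of Ethereum: the names Account, Transfer and applyTransfer are hypothetical, and the rule used is the one described above (a nonce just has to be bigger than the last one used by the sender).

```scala
object AccountModel {
  final case class Account(balance: Long, lastNonce: Long)
  final case class Transfer(from: String, to: String, amount: Long, nonce: Long)

  type State = Map[String, Account]

  def applyTransfer(state: State, tx: Transfer): Either[String, State] =
    state.get(tx.from) match {
      case None => Left("unknown sender")
      case Some(_) if tx.amount <= 0 => Left("amount must be positive")
      // The replay check: the nonce must be bigger than the last one used.
      case Some(s) if tx.nonce <= s.lastNonce => Left("nonce already used")
      case Some(s) if s.balance < tx.amount => Left("insufficient balance")
      case Some(s) =>
        // Debit the sender and remember the nonce, then credit the receiver
        // (creating the receiver account if it does not exist yet).
        val debited = state.updated(tx.from, Account(s.balance - tx.amount, tx.nonce))
        val receiver = debited.getOrElse(tx.to, Account(0L, 0L))
        Right(debited.updated(tx.to, receiver.copy(balance = receiver.balance + tx.amount)))
    }
}
```

Without the lastNonce field, a transfer that was valid once would stay valid as long as the balance allows it, so anybody could replay it.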

Lagonaki, First Public Testnet on Top of Scorex, Has Been Launched!


Scorex is a fully open (open-sourced under a public domain license) modular blockchain framework (GitHub). Modular means you can swap the consensus or transactional part of a blockchain system, or add a new p2p protocol, as easily as possible. The project is supported by IOHK.

To prove that the framework is indeed modular, we have implemented a few modules: a Proof-of-Stake consensus, a Permacoin implementation, and the simplest transactional module with just token transfers from one public key to another (the only kind of transactional module so far, unfortunately).

Permacoin is a consensus protocol based on a non-interactive Proof-of-Retrievability scheme for a static dataset. The paper is by Miller, Shi, Juels, Katz and Parno.

We are launching the first testnet release, called Lagonaki. Lagonaki = Scorex + Permacoin + SimplestTransactions

Lagonaki Debian package and sources: Testnet seed node API is opened there:

You can also run Lagonaki in Docker (set the wallet seed & wallet password in settings.json). The readme is in the Scorex repository (you can temporarily run Lagonaki from there).

I’m filling in the wiki pages at the moment.

Please contribute by testing! We are also looking for contributors. In particular, it would be amazing to see Bitcoin-like and Ethereum-like transactional modules (possibly by reusing BitcoinJ/EthereumJ code, I guess). And please join the developers mailing list.

On Private Blockchains, Technically



The concept of private blockchains is very popular these days. All the discussions I’ve seen were about their general and economic aspects. I suppose it is a good time to start a technical discussion around the concept, which I’ve been thinking about a lot since mid-2014, when I coined the “industrial blockchains” term in private conversations.

Private blockchains could have different requirements and so different designs. Five banks exchanging tens of millions of records per day is a different story than 10,000 art galleries submitting just 5,000 common ledger updates daily.

A blockchain-based system could be seen as a set of protocols. I am going to reason about how the consensus and transaction protocols could be implemented in a private blockchain.


Consensus in the global Bitcoin p2p network is based on solving moderately hard computational puzzles, also known as “Proof-of-Work”. I suppose Proof-of-Work is not an appropriate choice for a private blockchain, for at least the following reasons:

  • An adversary can easily take control of the network by outsourcing computations to Bitcoin miners (it would be cheap for a network not protected by a bunch of special hardware).

  • A private blockchain solutions provider can hardly justify the need for a customer to spend a ton of money on a datacenter full of ASICs in addition to the software to be run on computers.

How to determine the next block generator in the absence of computational puzzles? A few options are known:

  • Proof-of-Stake. A good and flexible solution for a network with a big number of participants. The method is suitable for non-monetary blockchain systems; in this case tokens called generation rights could be created in a genesis block. A big business can get a bigger share of generation rights and so has a bigger probability of generating a block.

  • Using a trusted blockchain as a random beacon. The paper “On Bitcoin as a public randomness source” shows that a Bitcoin block header could be used as a source of 32-68 random bits. Similarly, the generator signature field in an Nxt block header could be used as a source of 32 random bits. If the network participants are known, then a block generator could be chosen by using those random bits. There are some drawbacks to this approach: each node in the private blockchain should include a Bitcoin SPV client (or NRS in the case of Nxt), and block generation delays are determined by the trusted public blockchain (so 10 minutes on average for Bitcoin, 2 minutes for Nxt).

  • Using a known Byzantine fault tolerant solution to the distributed commit problem. As this is a favorite toy of CS researchers, there are a lot of possible algorithms described in papers. BFT solutions are better suited to small networks.
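As an illustration of the first two options combined, here is a sketch of choosing a block generator among known participants from 32 beacon bits, weighted by generation rights. The function name, the representation of rights and the simple modulo mapping (which is slightly biased when the total is not a power of two) are assumptions of this sketch, not a specification of any existing system.

```scala
object GeneratorChoice {
  // rights: participant -> amount of generation rights (all 1 for equal participants)
  // beaconBits: 32 random bits taken from a trusted public block header
  def choose(rights: Seq[(String, Long)], beaconBits: Int): String = {
    val total = rights.map(_._2).sum
    require(rights.forall(_._2 >= 0) && total > 0, "no generation rights issued")
    // Interpret the 32 bits as an unsigned number and map it to a ticket.
    val ticket = (beaconBits & 0xffffffffL) % total
    // Each participant owns a range of tickets proportional to its rights.
    val cumulative = rights.scanLeft(0L)(_ + _._2).tail
    rights(cumulative.indexWhere(ticket < _))._1
  }
}
```

Every node that sees the same beacon block and the same table of rights arrives at the same generator without any further communication.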

Transactional Model

There are many aspects to designing transactions carrying valuable business data, in particular:

  • In some cases blocks are not really needed (if transaction ordering is important, a DAG could be used).

  • If Proof-of-Stake is used for consensus, special kinds of transactions could be introduced to create, transfer and destroy generation rights.

  • A Bitcoin-like transaction with multiple inputs and outputs and scripts attached to both sides probably isn’t a good solution for most non-monetary use cases. Moreover, this choice could lead to very inefficient and heavy processing. For example, if a system is about tracking multiple assets, it’s better to take the Nxt asset-related transactions (while removing the others).

  • While in Bitcoin no information about the state is stored in a block header, in many cases it’s practical to include state-related information in blocks (as Ethereum includes the Merkle-Patricia trie root hash).

A Private Blockchain Design

Since the introduction of this article I have been implicitly making one point, to be stated directly now: one size doesn’t fit all. I have seen many trends in data storage and processing, and after working with a few “silver bullets” in the past I would like to say: no silver bullet has ever been found in this area. Always think about your data and the requirements around them, then choose or design a tool to work with. When we are talking about such a specific data storage as a blockchain, some questions should be answered in advance. Here is an example list:

  • How many participants will be in the network? Are they equal? Are all of them allowed to change the global state (so to generate blocks)?

  • What load is planned? E.g. how many transactions per hour or day.

  • The data model for the global state (ledger) should be considered. Please note that a blockchain is a replicated data structure, and there is no known work on how to shard it yet.

  • Could the state be designed in a way that allows some form of pruning?

  • The transaction model should be considered. How many bytes does an average transaction take?

  • What are the security requirements regarding consensus? What could be tolerated?

  • What are the privacy requirements? Should all the data be visible to everyone?


By now we already know how to build public blockchains (with a significant and sometimes critical lack of formalization, though). Private blockchains, which banks and financial institutions are so excited about at the moment, are unknown beasts: we still need to formulate precise questions and then provide answers. This article tries to stimulate work in this direction.

Appendix 1. The Scorex Project

I am the author of the Scorex project, a minimalistic blockchain engine for research purposes. I think Scorex would be useful for experiments with private blockchains.

Vandalizing on Ethereum Blockchain


If you are already full of excitement because you can write a Turing-complete script into the Ethereum blockchain, I’m going to excite you even more: it’s possible to write any garbage instead of contract code. My example transaction starts with an invalid instruction code, followed by some random crap. The fee is minimal, as nothing was executed. So I put around 1 KB into the blockchain (and onto the hard drives of a few thousand nodes) forever for just 4.5 Finney ($.002 at the moment). I can’t say anything about bigger amounts of data, as my geth client got fatally broken when I tried to submit a transaction of about 60 KB.

And the possibility of submitting invalid data as contract code probably couldn’t be eliminated by code analysis, as the EVM language has an arbitrary JUMP instruction.

Towards a New Frontier of the Smart Contracts: Hawk and Enigma



After the AT Project (incorporated into Burst and Qora) and then the Ethereum launch, we got the ability to store programmable logic of potentially arbitrary complexity on the blockchain and then execute it on each node. The “world computer” approach has some very obvious problems though:

  • scalability - executing code on each node in a network is just impractical. With tens (or hundreds) of companies planning to build something atop Ethereum, the reality is that all those projects probably can’t fit into a single blockchain

  • privacy - in many real-world scenarios contractors are not willing to disclose some data or code details to the public. Current smart contract solutions do not preserve privacy in any way

Research groups around the world are trying to solve these issues. This review covers the first two well-known approaches, namely Hawk and Enigma.

Language-Based Approach

Both projects offer the same approach to developing contracts: a language to be compiled into a privacy-preserving on-chain + off-chain cryptographic protocol.

In both projects a contract is executed as an on-chain + off-chain protocol, so only a reasonable minimum of data goes into the blockchain. Probably time will define a special kind of transactional design for a blockchain, better suited for storing a lot of certain factual information. At the moment neither paper specifies blockchain implementation details.

The approaches to preserving privacy are different. Enigma uses secure multi-party computation based on Shamir’s secret sharing, with MACs and commitments stored on the blockchain. A Hawk contract requires a minimally trusted manager, which cannot change the outcome of a contract and also cannot abort the protocol without losing its security deposit (but it can disclose the private information involved).
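For readers who have not met Shamir’s secret sharing, the building block can be sketched in a few lines. This is a textbook illustration only, not Enigma’s protocol: the MACs, commitments and the multi-party computation on top of the shares are omitted, the names and the choice of the prime 2^127 - 1 are mine, and a real system would need a cryptographically secure random source rather than a plain scala.util.Random.

```scala
import scala.util.Random

object ShamirSketch {
  // Arithmetic is done modulo the Mersenne prime 2^127 - 1.
  val p: BigInt = BigInt(2).pow(127) - 1

  // Split `secret` into n shares so that any t of them reconstruct it.
  def split(secret: BigInt, t: Int, n: Int, rnd: Random): Seq[(BigInt, BigInt)] = {
    require(t >= 1 && t <= n && secret >= 0 && secret < p)
    // Random polynomial of degree t - 1 with the secret as the constant term.
    val coeffs = secret +: Seq.fill(t - 1)(BigInt(126, rnd))
    (1 to n).map { x =>
      // Horner evaluation of the polynomial at point x.
      val y = coeffs.foldRight(BigInt(0))((c, acc) => (acc * x + c).mod(p))
      BigInt(x) -> y
    }
  }

  // Lagrange interpolation at 0 over the given shares.
  def reconstruct(shares: Seq[(BigInt, BigInt)]): BigInt =
    shares.map { case (xi, yi) =>
      val others = shares.map(_._1).filter(_ != xi)
      val num = others.foldLeft(BigInt(1))((acc, xj) => (acc * -xj).mod(p))
      val den = others.foldLeft(BigInt(1))((acc, xj) => (acc * (xi - xj)).mod(p))
      (yi * num * den.modInverse(p)).mod(p)
    }.foldLeft(BigInt(0))(_ + _).mod(p)
}
```

With t = 3 and n = 5, any three parties can recover the secret together, while two shares alone reveal nothing about it.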

Enigma vs. Hawk

The pretty short and concise Enigma whitepaper describes a platform concept consisting of a blockchain, a DHT (acting as a general-purpose database) & SMC. Some practical aspects of the system, e.g. fees & deposits, are not well described yet though.

In contrast, the much bigger Hawk paper is mostly not a platform description, but rather presents a solid and proven approach to constructing a compiler that translates high-level programs written within an ideal-world model into a real-world cryptographic protocol. I hope a whitepaper with a platform description will be released as well. Judging from the paper, they at least have a working compiler already.

It seems the Enigma team is going to release a more or less concrete smart contracts platform aiming to solve the most painful problems of today’s solutions, namely scalability and privacy. The Hawk team is going to open-source the framework, though it’s not clear at the moment what will be there apart from a compiler.

Both projects are very exciting and will move us to the next frontier in building decentralized economies. So good luck to both teams!

White Papers:

  1. Hawk: The Blockchain Model of Cryptography and Privacy-Preserving Smart Contracts.

  2. Enigma: Decentralized Computation Platform with Guaranteed Privacy.

On the Way to a Modular Cryptocurrency, Part 2: Stackable API



The previous chapter, Generic Block Structure, described how to split the blockchain-related core design of a cryptocurrency into two separate modules, so that concrete implementations can then be wired together in an application.

A cryptocurrency core application provides some API for its user. In this chapter I will show how to split the API implementation into pieces that are then wired together in an application. The code is being used in the Scorex project, a compact cryptocurrency core implementation for experiments.

Gluing Things Together

In the first place, some wrapper for a Spray route is needed:

trait ApiRoute {
  val route: spray.routing.Route
}

Then an actor composing routes:

class CompositeHttpServiceActor(val routes: ApiRoute*) extends Actor with HttpService {

  override def actorRefFactory = context

  // concatenate the routes of all the API pieces into a single route
  override def receive = runRoute(routes.map(_.route).reduce(_ ~ _))
}

Then, to create a new piece of API, an instance of ApiRoute overriding the route value is needed; see “Longer Example” in the spray-routing documentation for an example of a route definition (or the Scorex Lagonaki sources, scorex.api.http package).

Then, to glue all things together, we just create a concrete actor implementation and bind it to a port. An example from Scorex Lagonaki:

lazy val routes = Seq(
  AddressApiRoute()(wallet, storedState),
  BlocksApiRoute()(blockchainImpl, wallet)
  // ... further routes
)

lazy val apiActor = actorSystem.actorOf(Props(classOf[CompositeHttpServiceActor], routes), "api")

IO(Http) ! Http.Bind(apiActor, interface = "", port = settings.rpcPort)
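The composition pattern itself does not depend on Spray. The sketch below mimics it with plain partial functions instead of Spray routes, so it can be run without any dependencies; the Route type alias, the two toy API objects and the paths are made up for this illustration and are not the Scorex API.

```scala
object StackableApi {
  // Toy replacement for spray.routing.Route: request path => response body.
  type Route = PartialFunction[String, String]

  trait ApiRoute {
    def route: Route
  }

  object AddressApi extends ApiRoute {
    val route: Route = { case "/addresses" => "[]" }
  }

  object BlocksApi extends ApiRoute {
    val route: Route = { case "/blocks/height" => "0" }
  }

  // Plays the role of reducing Spray routes with `~`: try each piece in turn.
  def compose(routes: Seq[ApiRoute]): Route =
    routes.foldLeft(PartialFunction.empty[String, String])((acc, r) => acc orElse r.route)
}
```

Adding a new piece of API then means writing one more ApiRoute and appending it to the sequence; the composing code stays untouched.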