Alexander Chepurnoy

The Web of Mind

Forging Simulation Tool for a Proof-of-Stake Cryptocurrency

| Comments

Introduction

I’m the proud member of Consensus Research. Recently we published two executable simulations of forging(term for generating blocks activity in Proof-of-Stake). Multibranch forging simulation has the paper about it. I’m going to describe Nxt-like single-branch forging simulation tool in series of posts, some blogposts about multibranch will be published later also to give simplified description of the tool.

The Tool

The simulation tool is published on GitHub along with short description. Some internal details described in the Inside a Proof-of-Stake Cryptocurrency Part 4: The Executable Forging Simulation article. I don’t want to repeat myself here, so let’s take a closer look on how the tool works and what results it dumps out.

After launching the program it prints out “Starting cryptocurrency simulation…” then does its job silently and dumps out few results to the console when simulation is done:

Alt text

So it dumps out a balance of every node from it’s own balances sheet and common sub-chain length of a node with each other node. The screenshot above shows that most of nodes are agreeing on a chain with 121 or 122 blocks, while there’s a cluster of nodes having 130 blocks in common chain and one node stuck in fork at the moment of the end of an experiment with 116 common blocks.

A lot of detailed data goes into output folder, readme describes files going there.

Forging Quality

Forging quality seems to be worse than in real-world implementations e.g. Nxt mainnet blockchain: big forgers producing few blocks in a row more often(especially after connecting to a network with a genesis state), common chain is behind the current moment. I see some reasons for that:

  • in Nxt a node waits for blockchain to be downloaded then starts to forge, in simulation not
  • there are no latencies, clocks de-synchronization, also all processes are tied to the world clocks, so in case two nodes generate blocks at the same time a hard battle between forks has begun
  • Nxt Reference Software has some additional checks and patches not being presenting in the model

But our goal wasn’t to replicate Nxt forging and propagation processes, so it’s okay for now to have what we have.

Performance

Performance is bad at the moment. And I did not optimizations for the reason: for now understandability is much more important.

Simulating Different Models

Using the simulation tool a researcher can make experiments and get results quickly. For example, to change cumulative difficulty measure from sum of 1/baseTarget to sum of 1/(baseTarget^2) just one line modification needed:

cumulativeDifficulty chain = sum(map (\bl -> two64 / ((fromIntegral $ baseTarget bl)**2) ) chain)

And from first look things are getting worse after that.

Further Improvements

  • Better output, e.g. CSV format usage to work with results in Excel or statistics tools
  • Some aggregated metrics, e.g. number of forks living in a network, overall network consensus quality etc
  • Latency modelling
  • For now nodes stay online forever, better to model going offline at random moments
  • Better performance

Usage

  • We already using the mix of this tool with multibranch forging simulation to play with Nothing-at-Stake issues(a paper on that is ready and will be published soon)

  • We can make experiments very quickly (see “Simulating Different Models” section above)

  • As result, community can compare different forging models without long debates, e.g. compare Nxt & Qora forging, or see improvement proposals in action

Inside a Proof-of-Stake Cryptocurrency Part 4: The Executable Forging Simulation

| Comments

Introduction

In first 3 parts Nxt-like 100% proof-of-stake forging algorithm was described. It would be exciting to run executable simulation of it to see network-wide consensus in action. And now this simulation tool is published! Please take a look to it’s README for details, and this article is about some aspects of the source code unrevealed yet.

Modules

All the code is split into three modules(files):

  • blockchain.hs, module Blockchain.Structures - basic structures and functions, building bricks for the simulated cryptocurrency network & its forging algorithm

  • simulation.hs, module Blockchain.Simulation - structures & functions related to simulation

  • launcher.hs. module Main - some auxiliary functions & main function running simulation and dumping results into files

Most of basic structures and functions were decscribed in first 3 parts of the series, so now we’re going to take a look into simulation functions.

systemTransform

The most important function in simulation module is systemTransform taking current state of a network along with timestamp and common data for the simulation and transforming it to resulting network state by applying some partial transformers:

systemTransform :: SimulationData -> Network -> Network
systemTransform sd network = networkForge sd $ generateTransactions sd $
            propagateLastBlock sd $ propagateTransactions sd $ downloadBlocksNetwork sd $
        dropConnections sd $ generateConnections sd $ addNode sd $ addInvestorNode sd network

Where

  • addInvestorNode makes some initial investor online. Network starts with just one online investor out of 20, system balance is spread uniformly in a genesis block. Max number of investors online (14,000 seconds after genesis and on) are 13, means 65% forging power being online

  • addNode adds a node to the simulated network with no blockchain & empty account. The node needs to download a blockchain from other nodes.

  • generateConnections generates a connection for a node to another node if number of outgoing connection is less than a limit given in settings file(20 by default)

  • dropConnections works once per 60 seconds, drop oldest connection for a node, if number of outgoing connections equals to the limit

  • downloadBlocksNetwork selects random neighbour for each peer and check it’s cumulative difficulty, and if this metrics is better for the neighbour, the node is going to download a chunk of blockchain after common chain(14,400 blocks max)

  • propagateTransactions selects random neighbour for each peer and sends transactions from peer’s unconfirmed transactions pool a naighbour doesn’t have

  • propagateLastBlock selects nodes having last block generated by other node not more than 15 seconds ago and sends it out to a random neighbour

  • generateTransactions generates random transactions sending max 5% stake out(for account having more than 200 coins out of 1,000,000,000 issued)

  • networkForge asks each node whether it has a right to generate a block(by it’s own opinion) and if so generates blocks for forgers found and sends them out to random neighbours

Nothing-at-Stake simulation

By mixing multibranch forging simulation tool published previously with blockchain forging simulation tool described in the article it’s possible now to simulate Nothing-at-Stake attack, e.g. by mixing nodes doing blockchain forging with “attacker” doing multibranch forging. We’ll publish our work on that soon.

Conclusion & Further Work

Well, it’s the last article in the “Inside a Proof-of-Stake Cryptocurrency” series. All of them:

And executable code corresponding is on GitHub: https://github.com/ConsensusResearch/ForgingSimulation. Some modifications will be made, I want to replace all the constants with configurable parameters, at least. Performance optimization is also needed, for now it’s performance isn’t good for simplicity’s sake.

Inside a Proof-of-Stake Cryptocurrency Part 3: A Local Ledger

| Comments

Pre-Introduction: Consensus Research Group

After previous article I started to work with friend of mine, andruiman on deeper investigation of proof-of-stake. We are working now as Consensus Research group and have successful Phase 1 crowdfunding almost done.

We have published paper on multibranch forging, while I continue to describe a model of 100% Proof-of-Stake single-branch forging.

Introduction

Cryptocurrency ledger lives in p2p network and as any information shared in AP distributed environment it’s a subject of inconsistency. So we will talk about local ledger as there’s no global ledger in a network though algorithms behind a cryptocurrency try to build some common view under some assumptions.

Local Ledger

In pt. 1 global network-wide structure System was defined slightly wrong for simplicity’s sake:

data System = System {
        nodes :: [Node],
        connections :: [Connection],
        accounts :: [Account]
    }

For now this simplification becomes too rude, so we’re going to remove accounts and rename the structure to Network (connections definition also changed but just for performance’s sake)

data Network = Network {
    nodes :: [Node],
    connections :: Map Node [Node]
}

New field to be injected into node to store it’s local state is localView,

data Node =
    Node {
        localView :: LocalView,
        unconfirmedTxs :: [Transaction],
        account :: Account          -- simplification - one account per node
        isForging :: Bool
    }

and consist of two entities:

data LocalView =
    LocalView{
        nodeChain :: BlockChain,        -- simplification - one blockchain per node(but true for NRS)
        balances :: Map Account Int
    }

The latter entity containing balances for known accounts isn’t strictly necessary but without it simple question “what’s the account balance now?” will need for whole local blockchain to be processed to answer.

To update balances during transaction processing we define ‘applyTx’ transformer function:

applyTx :: Transaction -> Map Account Int -> Map Account Int
applyTx tx blns = addMoney amt (recipient tx) $ addMoney (-amt) (sender tx) blns
    where amt = amount tx

where auxiliary function addMoney just adds some diff value to an account balance:

addMoney ::  Int -> Account ->  Map Account Int -> Map Account Int
addMoney diff acc blns = insert acc ((findWithDefault 0 acc blns) + diff) blns

To process a block, we process each transaction it contains then add sum of all transaction fees to a block generator balance:

processBlock :: Block -> Map Account Int -> Map Account Int
processBlock block priorBalances = appliedWithFees
    where
        txs = transactions block
        txApplied = foldl (\bs tx -> applyTx tx bs) priorBalances txs
        fees = sum(map fee txs)
        appliedWithFees = addMoney fees (generator block) txApplied

A Local Ledger & The Network Consensus

In a proof-of-stake environment chance to generate block & incoming block check result are depend on a balance of an account. But information about balances is local, so all decisions are also made locally. E.g. If cluster of nodes have own view of network, they will reject blocks from other nodes and build own blockchain. Is that good or bad? Well, the question is open, as well as other consensus properties in proof-of-stake networks.

Further Work

Next chapter will be about building blocks of an executable simulation of Nxt-like single-branch forging. Sources of the simulation tool will be published the same time.

Inside a Proof-of-Stake Cryptocurrency Part 2: Forging Algorithm

| Comments

Introduction

Welcome to the second part of “Inside a Proof-of-Stake Cryptocurrency” series. First chapter defined basic structures specific to proof-of-stake(but not only) p2p cryptocurrencies. We’re going deeper now into details of a forging algo. Provided information is more specific to Nxt and Nxt-like 100% PoS currencies. It will be more code in this chapter so in the first place let’s have a closer look at some of its properties.

Functions & Types

(This section could be skipped while reading for the first time)

Everything is a function in Haskell. E.g. even txTimestamp “field” of Transaction record is a function in fact. So having value tx of Transaction type, you can get it’s timestamp by calling the function as txTimestamp tx.

Every function definition(except of implicitly defined functions e.g. record fields) starts with signature describing types of arguments and result. E.g. validate :: Transaction -> Bool signature means “function validate accepts an argument of Transaction type and returns boolean value(i.e. true or false). In the same manner calcGenerationSignature :: Block -> Account -> ByteString literally means “function calcGenerationSignature is producing ByteString value (i.e. array of bytes) from instances of Block and Account types”.

A function in Haskell is pure means the output is determined by inputs only. In case of purity violation output should be wrapped into a monad to make function pure again. Hopefully, no knowledge of monads is needed for this chapter.

Why does that matter in the context of the article?

  • Even from signatures provided in the article it’s obvious the algorithm is deterministic
  • Before reading function body try to understand its signature i.e. entities going in and out
  • All the code provided below is about field getters and pretty simple self-explanatory functions mostly. I hope the code is understandable even without Haskell knowledge.

A Chance to Be Richer

The goal of forging algorithm is to choose accounts(the only one in best case) having right to generate a block. Obviously, an algorithm should be:

  • verifiable
  • deterministic (or it will be not verifiable in cryptocurrency environment)

Target and Hit

Forging algorithm reminds distributed implementation of rock-paper-scissors game. Every account willing and having ability to forge generates pseudo-random value being called hit every second. If it’s less than stake-dependent targetvalue, a node the account is running against is going to generate a block. Other parties are preventing cheat by checking both values against incoming block, as hit and target are both deterministic.

Hit

Our first goal is to produce deterministic account-dependent hit value with fair distribution. Do you remember generationSignature :: ByteString field in Block structure? There are two purposes to have it:

  1. To link a block with a previous one.
  2. It’s also being using to generate hit

The formula for generationSignature is pretty simple: it’s the hash from generationSignature of previous block attached with generator’s public key. Following code calculates it using previous block and candidate account:

calcGenerationSignature :: Block -> Account -> ByteString
calcGenerationSignature prevBlock acct = bytestringDigest(sha256(append gsPrev pk))
    where pk = publicKey acct
          gsPrev = generationSignature prevBlock

Then hit is just a number constructed from first 8 bytes of generation signature:

calculateHit :: Block -> Account -> Integer
calculateHit prevBlock account =  fromIntegral $ runGet getWord64le first8
    where first8 = take 8 (calcGenerationSignature prevBlock account)

For genesis block generationSignature is set to some predefined value e.g. filled with zeros.

Target & Hit Verifying

To generate target value for our rock-paper-scissors game we use last block again. More preciously, we multiply its baseTarget field value by generator stake(i.e. effective balance) and elapsed time since last block generation timestamp(in seconds). Then calculated target is being used in verifyHit function having two use cases:

  • Honest forging account decides whether it has a right to forge a block by calling the function. Checking is to be done each second(and each second target increases along with a chance to generate a block by satisfying hit < target condition).
  • Other parties ensure incoming block is forged by a proper account have a right to do it

    verifyHit :: Integer -> Block -> Timestamp -> Integer -> Bool verifyHit hit prevBlock timestamp effBalance = (hit < target) && (eta > 0)

      where eta = timestamp - blockTimestamp prevBlock
            target = effBalance*(baseTarget prevBlock)*eta
    

Again, if a result of function call is true, a generator is going to form a block and push it to the network. Other nodes can check whether block is generated properly by calling the same function.

BaseTarget & Block Constructing Function

Now it’s time to define baseTarget :: Integer field of a block used as an initial value for a forger to generate its own target in previous section. As well as generationSignature, baseTarget is also depends on a previous block’s field value with some predefined constant for genesis block. In case of Nxt the base target value for the genesis block is:

initialBaseTarget :: Integer
initialBaseTarget = 153722867

Why 153722867? Well, chapter 3 of this series will give precise formula describing that. The baseTarget of a block is baseTarget of a previous block multiplied by elapsed time since previous block generation timestamp, in minutes. It’s also bounded to be not less than half of a previous block field value nor more than twice as high as it. Also it couldn’t be less than 1 and more than maxBaseTarget which is:

maxBaseTarget :: Integer
maxBaseTarget = initialBaseTarget * systemBalance

Having rules to calculate generationSignature and baseTarget we can define the function to form a block based on previous block, generator account, block generation timestamp and transactions to be included:

formBlock :: Block -> Account -> Timestamp -> [Transaction] -> Block
formBlock prevBlock gen timestamp txs =
    Block{transactions = txs, blockTimestamp = timestamp, baseTarget = bt, generator = gen, generationSignature = gs}
    where prevTarget = baseTarget prevBlock
          maxTarget = min (2*prevTarget)  maxBaseTarget
          minTarget = max (prevTarget `div` 2)  1
          candidate = prevTarget*(timestamp - (blockTimestamp prevBlock)) `div` 60
          bt = min (max minTarget candidate) maxTarget
          gs = calcGenerationSignature prevBlock gen

Difficulty

Within blocktree canonical blockchain is the path having max value (e.g. height in case of Bitcoin). We will use sum of 1/baseTarget to select canonical blockchain from possible options and call this function cumulativeDifficulty:

cumulativeDifficulty :: BlockChain -> Double
cumulativeDifficulty chain = sum(map (\bl -> 1 / (fromIntegral $ baseTarget bl) ) chain)

Transparent Forging

There’s a lot of buzz about transparent forging these days. The transparent forging concept is about a forger of a next block is to be known for all online network members in prior. It’s possible because the algorithm is deterministic but in practice there are some issues to be resolved:

  • A forger could miss its turn(there are some workarounds to make block generation quicker for a next forger though)
  • List of forging accounts for the whole network is needed. And it’s not possible to have it consistent constantly.
  • Even in case of pretty weak consistency it could be a tricky task to implement gossiping about forgers in privacy-friendly way.

Conclusion & Further Work

This chapter of the series defined functional building blocks and whole forging algorithm as well. Next chapter will be kind of academic paper about statistical modelling of Nxt forging algo with some interesting results. After that we will go deeper into forging simulation with Haskell.

Inside a Proof-of-Stake Cryptocurrency Part 1: Basic Structures

| Comments

Introduction

Pure and hybrid Proof-of-Stake cryptocurrencies are on the rise today. The most known pure PoS example, Nxt, has tens of forks at the moment (some of them are quite promising), the Ethereum team thinks about hybrid (PoS/PoW) mining principle, and so on. At the same time, there are many concerns about the concept, not formalized well enough, unfortunately, for example for example even the famous “nothing-at-stake” attack has not a good description. The sad thing is both good and bad things about proof-of-stake have no good formal description and model behind. The challenge of this series is to make some progress in this area.

Prerequisites

I assume you know some common things about Bitcoin and proof-of-work mining. It’s better to have basic coding skills to understand code snippets. Haskell programming language is being using for them (though Haskell knowledge isn’t required). We will start with model as simple as possible. I prefer to make a simplest thing harder than to work with complicated case from day one.

Time

Wikipedia defines time as “a measure in which events can be ordered from the past through the present into the future, and also the measure of durations of events and the intervals between them”. For distributed environments working with time means headache because of clock synchronization, events ordering etc. Fortunately blockchain-based networks make some problems easier. Due to this our model could be even easier: we assume all clocks in the system are synchronized! Fortunately, we will lose not too much in modelling quality because of that. Later we can introduce local clocks.

Time in our approach is just a number of seconds from genesis block, so our first entity definition is pretty simple:

type Timestamp = Integer

Transaction Model

Unlike the Bitcoin-like model where transaction has inputs connected with other transaction’s outputs as well as own outputs(possibly unspent), we will work with simpler approach. First, we introduce the Account entity:

data Account =
    Account {
        balance :: Integer,
        publicKey :: ByteString,
        isForging :: Bool
    }

It has an explicitly defined balance. Many simplifications made in comparison with real cryptocurrencies, in particular:

  • We use no addresses. In real cryptocurrencies the address is hash(publicKey), while we use publicKey itself as an address.

  • We even don’t use the private key! The private key is needed for transaction signing, and that is out of scope of our study.

Having Timestamp and Account entities described, we can define transaction as:

data Transaction =
    Transaction {
        sender :: Account,
        recipient :: Account,
        amount :: Integer,
        fee :: Integer,
        txTimestamp :: Timestamp
    }

Again, it’s far away from practical use. There’s no id, no even signature, so anyone could print money easily.

Forging

In proof-of-stake currencies there are no third-party “workers”, instead, stake holders are block generators. Instead of term mining we’ll use forging. The meaning is the same though: forging is a process of blocks generation. Remember the Account definition? The isForging field there shows whether account is participating in forging or not.

Blockchain

In every cryptocurrency I know transactions are grouped into blocks. With the following block definition Proof-of-Stake nature of our model starts to appear:

data Block =
    Block {
        transactions :: [Transaction],
        generator :: Account,
        blockTimestamp :: Timestamp
        baseTarget :: Integer,
        generationSignature :: ByteString
    }

transactions and blockTimestamp fields are self-describing. generator is account generated this block. baseTarget and generationSignature fields are related to forging algorithm and will be described in a next chapter of the tutorial.

The Blockchain is just a sequence of blocks:

data BlockChain = BlockChain { blocks :: [Block] }

Effective Balance

In proof-of-stake the probability of block generation depends on the account’s balance. In Nxt-like currencies, it is a function of the effective balance . E.g. in Nxt itself the effective balance is the balance 1440 blocks ago from last one minus all spendings for that period plus leasing forging balance given by other accounts etc. We can simplify again:

effectiveBalance :: Account -> Integer
effectiveBalance acc = balance acc

So the effective balance in our model is just the current balance.

The Network

Nothing has been said about whole p2p network yet. The first entity to be defined here is a node in the network:

data Node =
    Node {
        nodeChain :: BlockChain,
        unconfirmedTxs :: [Transaction],
        account :: Account
    }

Again, we’re dealing with simplifications:

  • Only one blockchain per node(this is true though e.g.for current Nxt Reference Software implementation)

  • Only one account per node

Having the Connection entity defined as just a tuple of Nodes:

type Connection = (Node, Node)

we can define biggest type in our model, System:

data System =
    System {
        nodes :: [Node],
        connections :: [Connection],
        accounts :: [Account]
    }

In proof-of-stake currencies all the money is usually being “printed” in a genesis block. The Nxt system has ~ 1 Billion balance from the genesis block:

systemBalance :: Integer
systemBalance = 1000000000

Forks

Forks are possible as well as with proof-of-work mining. How could we define fork? Well, a blockchain is stored in a node, and System has a list of nodes([Node]), so tree of blocks could be exctracted from System by function with following signature(implementation is missed for now).

 blockTree :: System -> BlockTree

Where Blocktree is just the alias for a list of Blockchain: type BlockTree = [BlockChain]

Why is a list called a tree? All blockchains have some common prefix(genesis block at least), so in fact we’re dealing with a tree structure. However, not enforcing tree structure by type system is not the best design definitely, but for simplicity’s sake we leave it as is for now.

Properties Analysis

How can Haskell code helps us with formal model analysis? Let’s start with a property given in Gavin Wood’s Ethereum YellowPaper :

The canonical blockchain is a path from root to leaf through the entire block tree. In order to have consensus over which path it is, conceptually we identify the path that has had the most computation done upon it, or, the heaviest path.

Canonical blockchain could be defined with a function having signature:

canonicalBlockchain :: System -> Maybe BlockChain

(again, no implementation for now). Maybe BlockChain could be Just blockchain(so defined), or Nothing(means undefined). So if function result is defined a system has some canonical blockchain. We can then enforce more strict conditions, e.g canonical blockchain for some time ago should be the prefix of current canonical blockchain.

There are two ways to make conclusions about the property:

  1. We can make executable forging algo model then gather statistics about function result.

  2. We can translate our Haskell model some Haskell-like language with dependent types(e.g. Coq) then, thanks to Curry–Howard correspondence, theorems could be formulated and proven about model properties.

Both ways will be shown in action in next chapters!

Conclusion & Further Work

Data structures needed for model (Timestamp, Account, Transaction, Block, BlockChain, Node, Connection, System, BlockTree) are described along with simplest functions over them. In next chapter forging algo functions will be defined, then executable imitation and formal analysis will be provided.

(Next part: Forging Algorithm)

Inside a Cryptocurrency / Inside the Nxt - Announce

| Comments

Have been working as Nxt developer for few months(and investigating cryptocurrencies before) I see a lack of understanding of proof-of-stake currencies and Nxt as well amongst many developers in the industry. To change this to a better I’m starting two blog series:

Inside a Cryptocurrency

You could read the series title as “Inside a PoS Cryptocurrency”. In Proof-of-stake(PoS further) currencies right to generate a block out of unconfirmed transactions is given to stakeholders, not miners, and chance of generation is proportional to an actual balance. PoS has many pros: probably faster block generation, less energy footprint, 51% stake is usually more expensive than 51% mining power so PoS currency could be more secure etc. On other side, there are some issues, the most known is so-called “Nothing-at-Stake”. Unlike previous attempts, I will describe Nothing-at-Stake attack in most formal way to the moment.

Articles in this series will be about very simple account-based proof-of-stake cryptocurrency model with code examples in Haskell. The model is exctracted from Nxt, but true for other 100%-PoS currencies I know.

Inside Nxt

Second series will contain shorter articles about NRS(Nxt Reference Software), it’s interaction with outer world, my https://github.com/kushti/NxtScala. Also I would like to give out some information for those who willing to contribute and some critics for sure!

Frequency & Contribution

I’ll post one or two articles per week probably in “Inside a Cryptocurrency”. I would be happy if someone could helps me with schemes or very simple pictures drawing. I don’t sure about “Inside Nxt” updates schedule at the moment.

Shareholders Meeting via Blockchain in 20 Lines of Code

| Comments

Seems we are witnesses of a new revolution in the corporate world caused by blockchain technologies. We’re moving quickly to virtual corporations, distributed autonomous corporations, smart contracts, new models of funding and so on. This article is about an exciting possibility of having distributed shareholders meetings where results are public and counted in trustless way.

Since late April I’m working on NXT cryptocurrency as a core developer. NXT is a 100% proof-of-stake cryptocurrency aiming “to transform a cryptocurrency into a decentralised financial platform that supports a thriving and fast-growing digital economy” (unlike hybrid Ripple’s approach which isn’t fully decentralized). Since May, 12th NXT has Asset Exchange (NXT AE) as a core feature. Anyone can issue assets on this totally decentralized and uncontrolled exchange and trade them. You can watch the assets being issued in web via several sites, e.g. NextBlocks, NXTReporting or SecureAE (you can even buy stocks with Bitcoins there). Spending some time on assets investigation you can find many possible applications of NXT AE, and many more are undiscovered yet.

In particular, you can fund your business via asset issuance getting money from anyone (not just angels/VCs) like KickStarter, but from day one and with no any fees. Probably you’ll get money from a community having a need for you product. It would be awesome to get feedback and community-powered decisions also, right?

With voting system I have finished recently that will be possible. The voting system will be introduced in NRS (NXT Reference Software) 1.3.x (x>0) or 1.4.0. The next stable release version is 1.2.1, so voting API & GUI will be accessible sometime in September, probably (earlier in testnet). But I can share some API details right now as they will be untouched probably.

Consider the example of an indy developer making a game. He got some funds via asset exchange. Then he made promising v.1 of the game. A concept is still raw as well as an implementation but fans are excited! So he wants to ask the community whether he should polish v.1 or make v.2 immediately built on top of better concept, or… So he’s starting the poll:

    val question = "Further directions in game development. "
    val description = "I got some great reviews. I have 10000 NXTs left. How should I spend them?"

    val finishBlockHeight = Nxt.getBlockchain.getHeight + 1440 // ~= 1 day
    val options = Array("Improve graphics and release it ASAP", "Start to work on v.2 to get funds for it's development", "Better ask experts, e.g. attend GameDev Conference", "Game is abortive. Stop working on it.")

    val optionModel = Poll.OPTION_MODEL_CHOICE
    val votingModel = Poll.VOTING_MODEL_ASSET

    val pb = new PollBuilder(question, desc, options, finishBlockHeight, optionModel, votingModel)
    val assetId: Long =  // assetId here
    pb.assetId(ai)
    pb.optionsNumRange(1, 1) // only 1 option to choose

    issueTxToGodId(new Attachment.MessagingPollCreation(pb), phrase1)

A vote could be sent with following code:

    val poll = Poll.getByName(question).head
    val vote = Array(0.toByte, 1.toByte, 0.toByte, 0.toByte)
    val attachment = new Attachment.MessagingVoteCasting(poll.getId, vote)
    issueTxToGodId(attachment, phrase2) 

1440 blocks after starting the poll we can get and print to console poll results(where 1 asset=1 vote):

    val pr = PollResults.get(poll.getId).get
    pr match {
        case cpr: nxt.PollResults#Choice =
            val m = cpr.getResults.toMap
            println(m)      
    }

So we have fair, cheap, public & distributed shareholders voting! Just imagine how it could change the world of business.

Faster Cosine Similarity Between Two Dicuments With Scala&Lucene

| Comments

For calculating cosine similarity between two documents I used modified(and rewritten to Scala) example by Sujit Pal/Mark Butler(http://sujitpal.blogspot.ch/2011/10/computing-document-similarity-using.html, http://stackoverflow.com/questions/1844194/get-cosine-similarity-between-two-documents-in-lucene?rq=1). It’s working, but under big workload(hundreds of comparisons in second in my case) it consumes a lot of CPU / memory and even worse, some of threads getting stuck for few minutes(!) somewhere inside the Lucene code. So I had to rewrite that code to avoid RAMDirectory / IndexReader usage(I need no to store documents to Lucene storage at all). Here are two functions I have written and now want to share:

  1. extractTerms function which extracts terms from document(presented as String) with stemming & stopwords removing:

     def extractTerms(content: String): Map[String, Int] = {    
         val analyzer = new StopAnalyzer(Version.LUCENE_46)
         val ts = new EnglishMinimalStemFilter(analyzer.tokenStream("c", content))
         val charTermAttribute = ts.addAttribute(classOf[CharTermAttribute])
    
         val m = scala.collection.mutable.Map[String, Int]()
    
         ts.reset()
         while (ts.incrementToken()) {
             val term = charTermAttribute.toString
             val newCount = m.get(term).map(_ + 1).getOrElse(1)
             m += term -> newCount       
         }
    
         m.toMap
     }  
    
  2. similarity function which calculates similarity between two term vectors

     def similarity(t1: Map[String, Int], t2: Map[String, Int]): Double = {
         //word, t1 freq, t2 freq
         val m = scala.collection.mutable.HashMap[String, (Int, Int)]()
    
         val sum1 = t1.foldLeft(0d) {case (sum, (word, freq)) =>
             m += word ->(freq, 0)
             sum + freq
         }
    
         val sum2 = t2.foldLeft(0d) {case (sum, (word, freq)) =>
             m.get(word) match {
                 case Some((freq1, _)) => m += word ->(freq1, freq)
                 case None => m += word ->(0, freq)
             }
             sum + freq
         }
    
         val (p1, p2, p3) = m.foldLeft((0d, 0d, 0d)) {case ((s1, s2, s3), e) =>
             val fs = e._2
             val f1 = fs._1 / sum1
             val f2 = fs._2 / sum2
             (s1 + f1 * f2, s2 + f1 * f1, s3 + f2 * f2)
         }
    
         val cos = p1 / (Math.sqrt(p2) * Math.sqrt(p3))
         cos
     }
    

To calculate cosine similarity between text1 and text2 just call similarity(extractTerms(text1), extractTerms(text2))

In tests my code is 7-10 times faster and has less memory footprint. Enjoy!

Investigating Namecoin Database Pt. 1 : Introduction & Objects Counting

| Comments

Introduction

Probably you get some knowledge about cryptocurrencies from all that massive buzz around. But cryptocurrencies are not only buzz about another merchant accepting Bitcoin, market news, fun story involving dogecoin etc. A lot of tech possibilities we have now behind just money as digital property being transferring from one peer to another.

One of the notable alternative cryptocurrencies is Namecoin, which could be considered as currency and key-value database as well with auto-expiration of records injected in the blockchain. Owner of record has to pay (0.01 NMC for now) to have it in database for period of time of 36,000 blocks(~250 days).

You can insert any name-value record in the database, though there are some formal namespaces, two for now: DNS system for .bit domains(names starting with “d/”) and NameId which is fully decentralized OpenId alternative(names starting with “id/”).

Objects Counting

We are starting to get deeper into the Namecoin database with counting named objects in it.

First, install Namecoin daemon namecoind (very like Bitcoin’s bitcoind installation), for details, visit namecoin.info or namecoin.org . Wait while blockchain will be downloaded.

Then pull out all active records into out.txt file with

namecoind name_scan "" 1000000 > out.txt

To count all objects in database dump you have now use

cat out.txt | grep '"name"' | wc -l

To count .bit domains being registered

cat out.txt | grep '"name" : "d/' | wc -l

To count NameId entries

cat out.txt | grep '"name" : "id/' | wc -l

As of today, Feb 26th, 2014, Namecoin database contains 156276 entries in total, including 140237 .bit domains and 2919 NameIds.

Flattening Scala Futures(Future[Future[T]] –> Future[T])

| Comments

Well, as scala.concurrent.Future class has no .flatten method, you could be wondered how to convert Future of Future e.g. Future[Future[T]] to just Future[T]. That’s can be easy done with combination of flatMap and identity functions:

//having fft:Future[Future[T]]
val ft:Future[Int] = fft.flatMap(x=> x)