Ken's blog: [pfzuviov] Merkle Tree state

Consider a Merkle Hash Tree computation which arranges the computation as a complete binary tree with all the data at the lowest level. We leave unstated the chunk size for each leaf.

Assume the data arrives online: we don't know how much data there is in total, and we don't get it all at once. How do we keep track of the state of a partial computation? Roughly, if the subtree below a node is a full binary tree, then the hashes of its children and further descendants can be forgotten: only the hash at the node is needed. Amount of space required to store state is probably O(log n) where n is the amount of data so far.

We would also like to keep and report the first few levels of the tree as final output. If a subsequent computation of the same data gets a different result, comparing the first few levels of the tree gives some indication of where things went wrong. As more data arrives, a node which was in the first few levels of the tree may stop being so, so should be forgotten at that point.

A parallel computation requires even more bookkeeping.

Motivation was to design a large, parallel computation task (for testing a system) in which a single error anywhere will propagate and be visible in the final output. (Inspiration was computation of digits of pi.) Input data for such a test can be integers in increasing order. (Update: modular exponentiation is a better way.)

Ken's blog

Sunday, November 17, 2019

[pfzuviov] Merkle Tree state

No comments :