Monday, December 01, 2008

64-bit Zobrist hashes insufficient?

As computers get faster, I worry that 64-bit Zobrist hashing may have too many collisions.

To be clear, there are two types of hash collisions. After calculating a 64-bit Zobrist hash, of course one does not use a full 2^64 entry hash table. Instead, one truncates, for example using "mod" to the size of the hash table, and collisions may result from the truncation. Such hash collisions may be detected by storing the key (the original 64-bit hash) at the hash table location.

The second, more fundamental collision, the one I am worried about, is that two different positions may have the same 64-bit Zobrist hash. This kind of collision will not be detected by the method above. Because of the Birthday Paradox, these collisions become likely after "only" 2^32 (four billion) positions, a number that fast computers these days are approaching (and surpassing) for the number of position evaluations per move. This is especially true in analysis, where you might leave your computer on all night evaluating a position, and it will rack up way more than 2^32 nodes evaluated.

The simple solution is to use a wider Zobrist hash, for example 128-bit.

No comments :