Thursday, April 28, 2022

[qmuawezq] information content of balanced bits

among all 128-bit numbers, about 7% have exactly 64 set bits.  uniformly randomly sampling one such number provides 124.17 bits of entropy.  among n-bit numbers (n even), the count with balanced bits is the central binomial coefficient binomial(n, n/2).  the proportion is asymptotically sqrt(2/(pi*n)), so it decreases slowly with larger n.  (incidentally, the asymptotic relation can be solved for pi, yielding 3.153888 for n=128.)
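
a quick check of those numbers in Python (my sketch; any bignum-capable calculator would do the same):

from math import comb, log2

n = 128
c = comb(n, n // 2)       # central binomial coefficient
p = c / 2**n              # proportion of 128-bit numbers that are balanced
print(p)                  # ~0.0704, about 7%
print(log2(c))            # ~124.17 bits
print(2 / (n * p**2))     # solving p ~ sqrt(2/(pi*n)) for pi: ~3.153888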

we compute the entropy of being close to balanced:

\\ PARI/GP; s = allowed slack around exactly z set bits out of 2*z (s = 1..4 below)
s=0; z=64; log(sum(i=z-s,z+s,binomial(2*z,i)))/log(2)

64 +- 1 = 63-65 set bits out of 128: 125.7 bits
64 +- 2: 126.4
64 +- 3: 126.9
64 +- 4: 127.2

130 bits, exactly 65 set: 126.2 bits
132 bits, exactly 66 set: 128.1 bits
156 bits (52*3), exactly 78 (52*3/2) set: 152.0 bits

so 3 decks of 52 cards, paying attention only to red (diamonds or hearts) or black (clubs or spades), can store 152 bits.
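
the same exact-balance computation for the other sizes, again a Python sketch:

from math import comb, log2

for n in (130, 132, 156):
    print(n, log2(comb(n, n // 2)))   # 126.2, 128.1, 152.0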

3 related questions.  first: how can one encode and decode arbitrary data to and from balanced bitstrings?  related: 8b/10b encoding.
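
one possible answer, a sketch of mine rather than anything the question presupposes: treat the data as an integer r with 0 <= r < binomial(n, n/2) and unrank it, combinatorial-number-system style, into the r-th n-bit string with exactly n/2 ones (lexicographic order).  Python, names mine:

from math import comb

def unrank(r, n, k):
    bits = []
    for rem in range(n, 0, -1):
        c = comb(rem - 1, k)   # strings that put a 0 in the current position
        if r < c:
            bits.append(0)
        else:
            bits.append(1)     # skip the c strings with a 0 here
            r -= c
            k -= 1
    return bits

any 124 bits of data fit this way, since 2^124 < binomial(128,64).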

second: how can one sample a balanced bitstring efficiently?  shuffling (say) 64 red cards and 64 black cards works, but consumes more entropy than necessary.
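
the shuffle baseline, sketched in Python: an ideal shuffle of 128 distinguishable slots consumes about log2(128!), roughly 716 bits of randomness, to produce a string worth only 124.17.

import random

bits = [1] * 64 + [0] * 64
random.shuffle(bits)   # Fisher-Yates over 128 distinguishable slots
print(sum(bits))       # always 64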

third, the other direction: given a random balanced bitstring, efficiently extract its entropy.
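
the extraction direction is the inverse of the unrank sketch above: rank the bitstring back to its index among strings with the same number of ones (Python, same conventions):

from math import comb

def rank(bits):
    n, k, r = len(bits), sum(bits), 0
    for i, b in enumerate(bits):
        if b == 1:
            r += comb(n - i - 1, k)   # count the strings with a 0 here instead
            k -= 1
    return r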

all three of these could be solved by a hypothetical algorithm that computes both directions of a one-to-one map between balanced bitstrings and consecutive integers.
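
with both directions of the sketch above in hand, sampling and round-tripping look like this (secrets.randbelow is just one convenient source of a uniform integer):

import secrets
from math import comb

n, k = 128, 64
r = secrets.randbelow(comb(n, k))   # uniform in [0, binomial(128,64))
bits = unrank(r, n, k)              # a uniformly random balanced bitstring
assert rank(bits) == r              # and back: ~124.17 bits recovered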

if we had paid attention to more than just red or black: one deck, log(52!)/log(2) = 225.6 bits; three separate decks, log(52!)*3/log(2) = 676.7 bits; three decks merged into one pile, with the copies distinguishable, log((52*3)!)/log(2) = 916.4 bits.  and then 156 bits more if card rotation by 180 degrees can be distinguished, e.g., by a non-symmetric back.
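
checking that arithmetic in Python:

from math import factorial, log2

print(log2(factorial(52)))       # 225.6
print(3 * log2(factorial(52)))   # 676.7
print(log2(factorial(52 * 3)))   # 916.4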
