Tuesday, March 27, 2012

[joxrneev] Revisiting double metaphone bit encoder

Revisiting this method of mapping bits to words and vice versa.

Instead of using a checksum to assign metaphone codes to the range 0..1023, run through a dictionary and assign codes explicitly so that each number has approximately the same number of words (or same weight according to word frequency).  This makes the description (much) less compact, but often gives more choices per number, allowing the formation of memorable phrases.
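One way the explicit assignment might be done is a greedy balancing pass: take the heaviest metaphone code first and drop it into whichever number currently carries the least weight. This is only a sketch. The name `assign_buckets` and the input `code_weights` (a mapping from a metaphone code to its word count or summed word frequency, computed elsewhere) are assumptions for illustration, not part of the original scheme.

```python
import heapq

def assign_buckets(code_weights, n_buckets=1024):
    """Greedily map each metaphone code to a number in 0..n_buckets-1.

    code_weights: hypothetical dict of code -> weight (word count or frequency).
    Heaviest codes are placed first, each into the currently lightest bucket.
    """
    # Min-heap of (total weight so far, bucket number).
    heap = [(0.0, b) for b in range(n_buckets)]
    heapq.heapify(heap)
    assignment = {}
    # Sort by descending weight; ties are broken by code for determinism.
    ordered = sorted(code_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    for code, weight in ordered:
        total, bucket = heapq.heappop(heap)
        assignment[code] = bucket
        heapq.heappush(heap, (total + weight, bucket))
    return assignment

# Toy example with made-up codes and weights, and only 2 buckets.
toy = assign_buckets({"TST": 5, "KT": 3, "FX": 3, "PRT": 1}, n_buckets=2)
```

The resulting table has to be stored or shipped alongside the scheme, which is the loss of compactness mentioned above.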

If a word was missed by the dictionary pass, fall back to a hash of its metaphone code, perhaps cksum, whose reversibility might come in handy (though reversibility is not strictly needed, since we can simply run through the entire 26^4 universe of codes).
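A rough sketch of the fallback and of the brute-force search over codes follows. The function names and the `assignment` table are hypothetical, and `zlib.crc32` is used only as a convenient stand-in: it is a CRC-32, but it does not produce the same values as the POSIX cksum utility.

```python
import itertools
import zlib

def bucket_for(code, assignment, n_buckets=1024):
    """Use the explicit table when the code is in it, otherwise hash the code."""
    if code in assignment:
        return assignment[code]
    # Stand-in hash (assumption): zlib's CRC-32, not the POSIX cksum CRC.
    return zlib.crc32(code.encode("ascii")) % n_buckets

def codes_for_bucket(bucket, alphabet, max_len=4, n_buckets=1024):
    """Brute-force inverse of the hash: list every code of up to max_len
    letters over the given alphabet that hashes to the given bucket."""
    hits = []
    for length in range(1, max_len + 1):
        for letters in itertools.product(alphabet, repeat=length):
            code = "".join(letters)
            if zlib.crc32(code.encode("ascii")) % n_buckets == bucket:
                hits.append(code)
    return hits
```

With a 26-letter alphabet and `max_len=4` the search covers under half a million candidates, so enumerating them all is cheap and the hash does not actually need to be invertible.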

Since small words are omitted from the dictionary, they carry no information and may be used to fill space between the information-carrying words to form sentences.
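Decoding a phrase with filler words might look like the sketch below. It assumes that "small" means shorter than some length threshold (`min_len`, chosen arbitrarily here) and that a word-to-number table `word_to_bucket` already exists; both are illustrative assumptions. Each remaining word contributes 10 bits, matching the range 0..1023.

```python
def decode_phrase(phrase, word_to_bucket, min_len=4):
    """Turn a phrase back into an integer, 10 bits per information-carrying word.

    Words shorter than min_len are treated as filler and skipped (assumption).
    word_to_bucket is a hypothetical dict of word -> number in 0..1023.
    Punctuation is not handled; words are split on whitespace only.
    """
    bits = 0
    for word in phrase.lower().split():
        if len(word) < min_len:
            continue  # filler word, carries no bits
        bits = (bits << 10) | word_to_bucket[word]
    return bits

# Made-up table entries purely for illustration.
table = {"purple": 3, "monkey": 1023}
value = decode_phrase("The purple monkey is on it", table)  # (3 << 10) | 1023 == 4095
```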
