first byte E: 0 .. 255. values 0, 1, 2, 3 are common.
next 2^E bytes: an integer denoting a length S.
next S bytes: payload of length S.
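a minimal sketch of an encoder for the three fields above. the name `encode` is made up for illustration, it always picks the smallest E that fits, and it assumes the length integer is big-endian, which the format as described does not fix.

```python
def encode(payload: bytes) -> bytes:
    """Frame payload as: E, then a 2^E-byte length S, then S payload bytes."""
    s = len(payload)
    # smallest E such that S fits in 2^E bytes
    e = 0
    while s >= 256 ** (2 ** e):
        e += 1
    # ASSUMPTION: big-endian length; the format itself leaves endianness open
    return bytes([e]) + s.to_bytes(2 ** e, "big") + payload
```

for example, `encode(b"abc")` gives the five bytes [0, 3, 97, 98, 99], and a 256-byte payload needs E=1 because 256 no longer fits in one byte.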
benefits:
- no need to worry about parsing (say) 3-byte integers. the length of the length will always be a power of 2.
- the first byte can realistically vary -- not a constant prefix -- forcing input parsers to consider the various possibilities, including overflow. can you handle E >= 64 (with lots of leading zeroes)? the length field alone is then at least 2^64 bytes long, so it will require multiple precision just to count the number of bytes read. reading more than 2^64 bytes through a pipe is not unrealistic.
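a sketch of a reader that takes the overflow concern seriously. it consumes the length field one byte at a time, so the field width is only ever used as a counter, and python's arbitrary-precision integers keep that counter exact even for large E. `decode` and `read_exact` are hypothetical names, big-endian length bytes are an assumption, and the final payload read assumes S is small enough to fit in memory.

```python
import io
from typing import BinaryIO


def read_exact(stream: BinaryIO, n: int) -> bytes:
    """Read exactly n bytes or raise EOFError on a truncated stream."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError(f"wanted {n} bytes, got {len(data)}")
    return data


def decode(stream: BinaryIO) -> bytes:
    """Read one frame (E, 2^E-byte length S, S payload bytes) and return the payload."""
    e = read_exact(stream, 1)[0]
    remaining = 1 << e  # width of the length field; a python int, so no overflow
    s = 0
    while remaining > 0:
        # ASSUMPTION: big-endian, most significant length byte first
        s = (s << 8) | read_exact(stream, 1)[0]
        remaining -= 1
    return read_exact(stream, s)
```

leading zero bytes cost nothing beyond the read itself, since `s` stays 0 until the first nonzero byte. a language with fixed-width integers would need a big-integer counter in place of `remaining` once E reaches 64.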
limit of 256^(2^255) - 1 bytes, which should be enough for everyone (~ 2.07 * 10^139427568484130471719462862803721671986013937475814382342177386385362323556246 ~= 10^10^77.14 bytes).
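a quick check of the magnitudes, done in floating point. it only confirms the exponents, not the 2.07 mantissa, which would need far more precision than a float carries. the variable names are made up for illustration.

```python
import math

# largest E is 255, so the length field is 2^255 bytes = 2^258 bits
bits = 8 * 2 ** 255
assert bits == 2 ** 258

# log10(256^(2^255)) = 2^258 * log10(2); the "- 1" is invisible at this scale
log10_limit = bits * math.log10(2)
log10_log10_limit = math.log10(log10_limit)

print(f"{log10_limit:.4e}")        # 1.3943e+77
print(f"{log10_log10_limit:.2f}")  # 77.14
```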
a size width larger than necessary is permitted, e.g., E=3 (2^3 = 8 bytes) to express a length of 255 when E=0 (2^0 = 1 byte) suffices. "larger than necessary" is in fact required to express 0, since 0 needs no bytes at all but the smallest width is 1 byte. the byte sequence [0,0] and the byte sequence [1,0,0] both encode the empty string (and other encodings are possible as well). the prefix of a given string (payload) is not unique.
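to make the non-uniqueness concrete, here is a sketch of an encoder where the caller chooses E. `encode_with_e` is a hypothetical name, and big-endian length bytes are again an assumption.

```python
def encode_with_e(payload: bytes, e: int) -> bytes:
    """Frame payload with a caller-chosen E.

    Raises OverflowError if the payload length does not fit in 2^E bytes.
    """
    # ASSUMPTION: big-endian length; the format itself leaves endianness open
    return bytes([e]) + len(payload).to_bytes(2 ** e, "big") + payload


# three of the many valid encodings of the empty string
empty_encodings = [list(encode_with_e(b"", e)) for e in range(3)]
print(empty_encodings)  # [[0, 0], [1, 0, 0], [2, 0, 0, 0, 0]]
```

a decoder therefore cannot compare framed messages byte for byte to decide whether two payloads are equal. it has to compare the decoded payloads, or the format has to require the smallest E.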
need to specify the endianness of the length integer.
previously, base 10.