Consider using mathematically generated text as (one of) the test cases for data compression methods. For example:
Look and say sequence: 1 11 21 1211 111221 312211 13112221 1113213211 ...
Dragon curve: RRLRRLLRRRLLRLL...
Gray code.
Regular counting, in any base.
The sequences have a lot of repetition, but do not repeat in any obvious periodic way, which is often a characteristic of data one wants to compress.
A quick test on 1 GB of Dragon Curve found that bzip2 compressed much better than gzip or xz but was also much slower.
Also human genome.
No comments :
Post a Comment