Wednesday, June 16, 2021

[onfvsnqp] Roundoff during color space conversion in JPEG

The JPEG standard ITU-T T.871 specifies equations for converting RGB colors (8 bits per component, 24 bits total) to and from Y'CbCr (YCbCr), also 8 bits per component, 24 bits total.  We implemented this color space conversion in Haskell, using exact Rational arithmetic and investigated differences between exact and approximate arithmetic.  (Having a rational number library which Just Works is really nice in Haskell.)

The coefficients for the conversion are given below, expressed in Haskell's syntax for rational numbers: "p % q" = p/q.  We also give the coefficients approximated in decimals for ease of lining up the coefficients with those of decimal-approximated color space conversion equations published in many other places.

to_ycbcr_matrix [[0.0000 = {0 % 1},0.2990 = {299 % 1000},0.5870 = {587 % 1000},0.1140 = {57 % 500}],[128.0000 = {128 % 1},-0.1687 = {(-299) % 1772},-0.3313 = {(-587) % 1772},0.5000 = {1 % 2}],[128.0000 = {128 % 1},0.5000 = {1 % 2},-0.4187 = {(-587) % 1402},-0.0813 = {(-57) % 701}]]

to_rgb_matrix [[0.0000 = {0 % 1},1.4020 = {701 % 500}],[-0.3441 = {(-25251) % 73375},-0.7141 = {(-209599) % 293500}],[1.7720 = {443 % 250},0.0000 = {0 % 1}]]

We investigate which RGB colors, when converted to 24-bit YCbCr and then back to RGB, map back onto themselves.

(Future work: are there color cycles of length greater than 1?  Are the specified conversion equations optimal for mapping colors back onto themselves?  Even if not and you deploy better equations, you typically only control one leg of the round trip.)

(Future work: which colors have the most round-trip error, measured in a color space such as Cielab?)

Running through all 8^3 = 2^24 = 16777216 possible RGB colors, we found that 3999890 colors mapped back onto themselves (24%).  24-bit color is actually 21.93153 bit color.

time ./a.out | perl -nlwe 'die unless /^\[(\S+)\] .* \[(\S+)\]$/;print if $1 eq $2'| nice time plzip --best > output.lz

$ plzip -dc output.lz | wc
3999890 139996150 1140200754

(plzip --best compresses the 1.14 GB output text file to 76364987 bytes (76 MB).)

A common image (and video) compression operation after mapping to YCbCr is to downscale the chrominance components Cb and Cr in spatial resolution by 1/2.  JPEG typically does this, as does YUV420p.  How many YCbCr color possibilities are there for a 2x2 block of pixels?  If the Y, Cb, and Cr components were each to use their full 8-bit range, then there would be 8 bits each for the two C components shared by all 4 pixels, and 8 bits each for the 4 Y components, for a total of 8*2+8*4 = 48 bits per 4 pixels, or an average of 12 bits per pixel.

But not all possibilities of Y, Cb, Cr occur in the 3999890 colors surviving the RGB round trip.  Only 46505 combinations of (Cb,Cr) occur.  We group the colors by (Cb,Cr).  For each (Cb,Cr), we count the number of Y values that occur.  There are 4 pixels in a 2x2 block, each can have a Y value independently, so to determine the number of ways to color a 2x2 block for a given (Cb,Cr), raise its count of possible Y values to the 4th power.  The sum of those 4th powers over all possible (Cb,Cr) is 13542893092828, or about 2^43.6.  Divide that by 4 to get an average of 10.9056503 bits per pixel, down from 12 bpp above.  The sum has prime factorization 2 * 2 * 11 * 37 * 8318730401 .

plzip -dc output.lz | perl -nlwe 'die unless /ycbcr8 (\S+)/;print$1'| sort -t, -k2n,2 -k3n,3| ./a.out ysizes | wc -l
plzip -dc output.lz | perl -nlwe 'die unless /ycbcr8 (\S+)/;print$1'| sort -t, -k2n,2 -k3n,3| ./a.out y4sum

Unfortunately, typical JPEG encoders and decoders do not do exact rational arithmetic for color space conversion.

One situation in which errors can occur is if an intermediate value before rounding to an 8-bit integer has a fractional component of exactly 1/2.  The standard specifies rounding should be floor(x+0.5), so a value INTEGER+1/2 should always be rounded to INTEGER+1.  However, floating point errors or other reasons can cause incorrect rounding.  (For example, Haskell's round function rounds INTEGER+1/2 to the nearest even integer, a.k.a. "bankers' rounding".)  Therefore, from the 3999890 colors, we remove colors which have a fractional component 1/2 in any component of its unrounded YCbCr or RGB values, because those colors might not be reliably converted.  This eliminates 478 colors, leaving 3999412 colors, 24% of the 24-bit color space, or 21.93136 bit color.  46475 combinations of (Cb,Cr) occur.  There are 13539907820900 ways to color a 2x2 block, or 10.90557 bits per pixel.  The sum has prime factorization 2 * 2 * 5 * 5 * 135399078209 .

plzip -dc output.lz | grep -v 'ishalf...=True' | wc -l
plzip -dc output.lz | grep -v 'ishalf...=True' | perl -nlwe 'die unless /ycbcr8 (\S+)/;print$1'| sort -t, -k2n,2 -k3n,3| ./a.out ysizes | wc -l
plzip -dc output.lz | grep -v 'ishalf...=True' | perl -nlwe 'die unless /ycbcr8 (\S+)/;print$1'| sort -t, -k2n,2 -k3n,3| ./a.out y4sum

An example of a color that encounters a fractional component of exactly 1/2 when converting from RGB to YCbCr is rgb:00/00/0d (0 0 13 in decimal).  An example of a color that encounters a fractional component of exactly 1/2 when converting from YCbCr to RGB is rgb:00/01/fb (0 1 251 in decimal), which went through the intermediate YCbCr color 29 253 107 (in decimal).

$ ./a.out do1 0 0 13
[0,0,13] rgb:00/00/0d [1.4820 = {741 % 500},134.5000 = {269 % 2},126.9429 = {88987 % 701}] ishalfyuv=True toocloseyuv=True ycbcr8 [1,135,127] [-0.4020 = {(-201) % 500},-0.6948 = {(-203929) % 293500},13.4040 = {3351 % 250}] ishalfrgb=False tooclosergb=False [0,0,13]

$ ./a.out do1 0 1 251
[0,1,251] rgb:00/01/fb [29.2010 = {29201 % 1000},253.1687 = {448615 % 1772},107.1719 = {150255 % 1402}] ishalfyuv=False toocloseyuv=False ycbcr8 [29,253,107] [-0.4420 = {(-221) % 500},0.9798 = {287579 % 293500},250.5000 = {501 % 2}] ishalfrgb=True tooclosergb=True [0,1,251]

Note that we are ignoring situations where errors cause a color to map onto itself even though the spec says it should not.  (Future work: investigate this.)

We experimented with the cjpeg and djpeg programs in version 1.5.2-0ubuntu5.18.04.3 of the libjpeg-turbo-progs Ubuntu package.  Color space conversion is done in jccolor.c . The comments say the all-integer implementation is done with 16-bit precision, or "about 4 digits precision", but we demonstrate it is less than that.

The color rgb:00/cf/23 (0 207 35 in decimal) yields a Y value of exactly 125.499 but the cjpeg|djpeg pipeline probably rounds that to 126.

$ ./a.out do1 0 207 35
[0,207,35] rgb:00/cf/23 [125.4990 = {125499 % 1000},76.9283 = {136317 % 1772},38.4857 = {53957 % 1402}] ishalfyuv=False toocloseyuv=True ycbcr8 [125,77,38] [-1.1800 = {(-59) % 50},206.8232 = {30351307 % 146750},34.6280 = {8657 % 250}] ishalfrgb=False tooclosergb=False [0,207,35]

$ ./a.out yuv1 126 77 38
ycbcr8 [126,77,38] [-0.1800 = {(-9) % 50},207.8232 = {30498057 % 146750},35.6280 = {8907 % 250}] ishalfrgb=False tooclosergb=False [0,208,36]

$ ppmmake rgb:00/cf/23 16 16 | cjpeg -q 100 -dct float | djpeg -dct float | pnmnoraw | head -4
P3
16 16
255
0 208 36  0 208 36  0 208 36  0 208 36  0 208 36  0 208 36

The color rgb:ff/30/dc (255 48 220 in decimal, exactly the inverse of the color above) yields a Y value of exactly 129.501, but the cjpeg|djpeg pipeline probably rounds that to 129.

$ ./a.out do1 255 48 220
[255,48,220] rgb:ff/30/dc [129.5010 = {129501 % 1000},179.0717 = {317315 % 1772},217.5143 = {304955 % 1402}] ishalfyuv=False toocloseyuv=True ycbcr8 [130,179,218] [256.1800 = {12809 % 50},48.1768 = {7069943 % 146750},220.3720 = {55093 % 250}] ishalfrgb=False tooclosergb=False [255,48,220]

$ ./a.out yuv1 129 179 218
ycbcr8 [129,179,218] [255.1800 = {12759 % 50},47.1768 = {6923193 % 146750},219.3720 = {54843 % 250}] ishalfrgb=False tooclosergb=False [255,47,219]

$ ppmmake rgb:ff/30/dc 16 16 | cjpeg -q 100 -dct float | djpeg -dct float | pnmnoraw | head -4
P3
16 16
255
255 47 219  255 47 219  255 47 219  255 47 219  255 47 219  255 47 219

These two colors had the largest rounding errors we could find, but it was not an exhaustive search.  (Future work: link to libjpeg-turbo and do an exhaustive search.  Alternatively, do a few large images packed with all possible colors in 16x16 solid-color blocks.)  (Future work: investigate other JPEG libraries.)

The following command lists all the differences with cjpeg when distance from 1/2 is in the range 1/1024 ... 2/1024. The above two examples (0 207 35 and 255 48 220) are the only 2 colors in that window that fail among the 8140 tests. Their distance from 1/2 is 1/1000, which is why we use that value in farenoughthreshold in the code and "tooclose" in the output.

time ./a.out halfoff "1%1024" 1 > p2 && wc -l p2 && time for file in $(perl -nlwae 'print$F[1]' p2) ; do ppmmake $file 16 16 | cjpeg -q 100 -dct float | djpeg -dct float | pnmnoraw | head -4 | tail -1 | perl -nlwae 'print"$F[0] $F[1] $F[2]"' ; done > p3 && perl -nlwe 'die unless /^[(\d+),(\d+),(\d+)/;print"$1 $2 $3"' p2 > p4 && diff p4 p3

From this investigation, assume that colors which have intermediate values with fractional components between 0.499 and 0.501 inclusive cannot be reliably encoded by JPEG.  Starting from 3999890 colors above, this eliminates 8628 colors, leaving 3991262, 24% of the 24-bit color space, or 21.92841 bit color.  46383 combinations of (Cb,Cr) occur.  There are 13509753605362 ways to color a 2x2 block, or 10.90477 bits per pixel.  The sum has prime factorization 2 * 7 * 37181 * 25953643 .

plzip -dc output.lz | grep -v '=True' | wc -l
plzip -dc output.lz | grep -v '=True' | perl -nlwe 'die unless /ycbcr8 (\S+)/;print$1'| sort -t, -k2n,2 -k3n,3| ./a.out ysizes | wc -l
plzip -dc output.lz | grep -v '=True' | perl -nlwe 'die unless /ycbcr8 (\S+)/;print$1'| sort -t, -k2n,2 -k3n,3| ./a.out y4sum

We exhaustively tried all 8628 eliminated colors, seeing which ones the cjpeg|djpeg pipeline correctly preserved.  6531 were preserved; 2097 were incorrect.

Major future work: provide an option to libjpeg (and derivatives) to do color space conversion correctly at the expense of speed.  This option would join the many other speed versus accuracy tradeoff options that libjpeg already presents as part of its API.

Motivation: how much data can we pack into a JPEG image?

Source code in Haskell.

Similar work by someone else: https://www.hackerfactor.com/blog/index.php?/archives/250-Showing-JPEGs-True-Color.html

No comments :