Roman numerals are terrible, but we explore making them even worse.
Extend additive notation the obvious way: iii=3, iiii=4, iiiii=5, iiiiii=6, vv=10, vvv=15.
Extend subtractive notation in the following (terrible) way. We describe it around the character D (500), but the full system is its generalization to all Roman numeral characters.
First, consider uD, where u is a Roman numeral string composed of one or more of the characters I V X L C (i.e., anything less than D). Interpret u (recursively) as a number. Then value(uD) = 500 - value(u).
ix = 9. iix = 8. vix = 10 - 6 = 4.
Incidentally, iiiiiv becomes one way to express zero. So is vvx. We can also create negative numbers: vvix = -(5 + 5 + 1) + 10 = -1.
Next, consider uDvDwDx, where u, v, w, and x are each (possibly empty) strings composed of characters less than D. Evaluate as follows: value(uDvDwDx) = (500-value(u)) + (500-value(uv)) + (500-value(uvw)) + value(x). Here uv and uvw denote string concatenation, so subtractive prefixes can get used multiple times. It feels a bit like earlier strings, e.g., u, distribute over later strings v and w.
idid = id + iid = 499 + 498 = 997. diid = 500 + 498 = 998. iidd = iid + iid = 498 + 498 = 996. idvd = id + ivd = 499 + 496 = 995. vdidxd = vd + vid + vixd = 495 + (500-value(vi)) + (500-value(vix)) = 495 + 494 + 496 = 1485.
xixixixixixixixixixix = 10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 + 0 = 55.
Subtractive prefixes get reused through concatenation only at one level, across repeats of the largest character; a prefix is not carried across different characters:
iv = 4. ivx = 10-value(iv) = 6, not anything bizarre like value(ix)-value(iv). ixvx = (10-value(i)) + (10-value(iv)) = 9 + 6 = 15.
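As a minimal sketch (not the linked source code), the rules above can be written as a short recursive evaluator: find the largest character in the string, split the string at each of its occurrences, subtract the value of each growing prefix concatenation, and add the value of whatever trails the last occurrence. The names charValue, splitAtTop, and value are assumptions chosen for this illustration.

```haskell
import Data.Char (toUpper)

-- Standard values of the seven Roman numeral characters (case-insensitive).
charValue :: Char -> Integer
charValue c = case toUpper c of
  'I' -> 1
  'V' -> 5
  'X' -> 10
  'L' -> 50
  'C' -> 100
  'D' -> 500
  'M' -> 1000
  _   -> error ("not a Roman numeral character: " ++ [c])

-- Split a string at every character worth `top`, keeping empty pieces,
-- e.g. splitAtTop 500 "IDVD" == ["I", "V", ""].
splitAtTop :: Integer -> String -> [String]
splitAtTop top s = case break ((== top) . charValue) s of
  (piece, [])       -> [piece]
  (piece, _ : rest) -> piece : splitAtTop top rest

-- value(uDvDwDx) = (500 - value u) + (500 - value uv) + (500 - value uvw) + value x,
-- with D standing for whichever character is the largest in the string.
value :: String -> Integer
value "" = 0
value s  = sum [top - value p | p <- prefixes] + value (last pieces)
  where
    top      = maximum (map charValue s)
    pieces   = splitAtTop top s
    prefixes = scanl1 (++) (init pieces)  -- u, uv, uvw, ...
```

With this sketch, value "vdidxd" gives 1485 and value "ixvx" gives 15, matching the worked examples. The recursion terminates because every prefix and the trailing piece contain only characters smaller than the one being split on.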
This system is backward compatible with existing Roman numerals. It assigns values to strings which were previously invalid, and no previously valid strings change their value. Every string of Roman numeral characters now has a unique value. There are many ways to express any given value.
Here is Haskell source code for evaluating these "worse" Roman numerals. We also provide routines for finding the shortest representation of a given Arabic number. The shortest representation is found by breadth-first search. At the end of this post, we give the shortest representation of all numbers from -100 to 100. Some numbers have multiple possible shortest representations: we give them in a decreasing aesthetic order described in the compareroman function.
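One way to picture the breadth-first search: enumerate all strings of length 1, then length 2, and so on, and stop at the first length where some string has the target value. The sketch below does that by brute force. The names stringsOfLength and shortest are assumptions for illustration; the linked code may be organized differently, and the tie ordering of compareroman is not reproduced here.

```haskell
-- Compact evaluator following the rules in the text (uppercase input only).
charValue :: Char -> Integer
charValue c = case lookup c (zip "IVXLCDM" [1, 5, 10, 50, 100, 500, 1000]) of
  Just v  -> v
  Nothing -> error ("not a Roman numeral character: " ++ [c])

value :: String -> Integer
value "" = 0
value s  = sum [top - value p | p <- scanl1 (++) (init pieces)] + value (last pieces)
  where
    top    = maximum (map charValue s)
    pieces = split s
    -- Split at every occurrence of the largest character, keeping empty pieces.
    split t = case break ((== top) . charValue) t of
      (piece, [])       -> [piece]
      (piece, _ : rest) -> piece : split rest

-- All strings of exactly n Roman numeral characters.
stringsOfLength :: Int -> [String]
stringsOfLength n = sequence (replicate n "IVXLCDM")

-- Every shortest non-empty string whose value is the target.
-- Searches length 1, 2, 3, ... and does not terminate if no string exists.
shortest :: Integer -> [String]
shortest target = case dropWhile null levels of
  (found : _) -> found
  []          -> []
  where
    levels = [filter ((== target) . value) (stringsOfLength n) | n <- [1 ..]]
```

For example, shortest 0 yields ["VVX", "LLC", "DDM"] and shortest 3 yields ["III", "IIV"]. Each level costs 7^n evaluations, so this is only practical for targets with short representations.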
Future work: assume the minus sign is available for negation. For which negative numbers (if any) does the shortest representation not use it? Are there any positive numbers whose shortest representation does use it? (Both of these seem unlikely.) Another possibility: invent a character for zero, perhaps N for nihil or nullus, and consider expressing negative numbers with subtractive prefixes in front of N.
Future work: instead of the standard set of Roman numeral characters and their values, consider some different set of values for characters: powers of 2, Fibonacci numbers, square numbers. Does anything interesting happen? Perhaps interesting things happen when seeking the shortest representation of numbers.
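A hedged sketch of how such an experiment might be set up: if each symbol is identified with its numeric value, the same evaluation rule works for any set of distinct values, with no lookup table needed. The name valueOf and the list-of-values encoding are assumptions made for this illustration, not part of the original code.

```haskell
-- A numeral is a list of symbols, and each symbol is identified with its value,
-- so any set of distinct values can serve as the "alphabet".
valueOf :: [Integer] -> Integer
valueOf []   = 0
valueOf syms = sum [top - valueOf p | p <- scanl1 (++) (init pieces)] + valueOf (last pieces)
  where
    top    = maximum syms
    pieces = split syms
    -- Split at every occurrence of the largest symbol, keeping empty pieces.
    split t = case break (== top) t of
      (piece, [])       -> [piece]
      (piece, _ : rest) -> piece : split rest
```

With the standard values, valueOf [1, 10, 5, 10] is 15, the same as ixvx. With powers of 2 as the symbols, valueOf [1, 2, 4] is 4 - (2 - 1) = 3 and valueOf [1, 4, 2, 4] is (4 - 1) + (4 - 1) = 6.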
We define a sequence "worstcase" that exemplifies the "worst case" of our worse Roman numeral system. Each line below has the next largest Roman numeral character interleaved between the characters of the previous line, and one more at the end.
1 I
4 IV
15 IXVX
161 ILXLVLXL
83 ICLCXCLCVCLCXCLC
5349 IDCDLDCDXDCDLDCDVDCDLDCDXDCDLDCD
-57745 IMDMCMDMLMDMCMDMXMDMCMDMLMDMCMDMVMDMCMDMLMDMCMDMXMDMCMDMLMDMCMDM
With Roman numeral characters beyond M, the sequence would continue
1304801 -42453397 2336054109 -221579717657 36896704797401 -10904184517859485 5768308016008033877 -5503513512222683409697
Getting the last 4 values required memoization in order not to run out of memory. We used Data.MemoTrie.memoFix from the MemoTrie package, version 0.6.9. It had higher performance than Data.Function.Memoize in the memoize package, version 0.8.1. We did not have the patience to compute the next number in the sequence.
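The construction of worstcase through M can be sketched without memoization, since the strings stay short (at most 64 characters). The code below is an illustrative assumption rather than the linked source: it builds each line by following every character of the previous line with the next larger character, and evaluates it with a compact evaluator written from the rules earlier in the post. It does not attempt the hypothetical characters beyond M.

```haskell
-- Compact evaluator following the rules in the text (uppercase input only).
charValue :: Char -> Integer
charValue c = case lookup c (zip "IVXLCDM" [1, 5, 10, 50, 100, 500, 1000]) of
  Just v  -> v
  Nothing -> error ("not a Roman numeral character: " ++ [c])

value :: String -> Integer
value "" = 0
value s  = sum [top - value p | p <- scanl1 (++) (init pieces)] + value (last pieces)
  where
    top    = maximum (map charValue s)
    pieces = split s
    -- Split at every occurrence of the largest character, keeping empty pieces.
    split t = case break ((== top) . charValue) t of
      (piece, [])       -> [piece]
      (piece, _ : rest) -> piece : split rest

-- "I", "IV", "IXVX", "ILXLVLXL", ...: each step follows every character of
-- the previous string with the next larger Roman numeral character.
worstcase :: [String]
worstcase = scanl interleave "I" "VXLCDM"
  where
    interleave prev c = concatMap (\x -> [x, c]) prev
```

Here map value worstcase evaluates to [1, 4, 15, 161, 83, 5349, -57745], agreeing with the seven lines listed above.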
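Shortest representations of all numbers from -100 to 100 (each line gives the value, then every shortest representation):
0 VVX LLC DDM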
1 I
-1 VVIX LLIC DDIM
2 II
-2 VVIIX IVVVX VVIXX LLIIC LLICC DDIIM DDIMM
3 III IIV
-3 VIVVX ILLVC IDDVM
4 IV
-4 VVVXI VVIVX LLVCI LLIVC LILVC DDVMI DDIVM DIDVM
5 V
-5 VVVX LLVC DDVM
6 VI
-6 VVVIX LLVIC DDVIM
7 VII
-7 VVVIIX VIVVVX LLVIIC ILLXCI ILLIXC ILILXC DDVIIM IDDXMI IDDIXM IDIDXM
8 IVV IIX
-8 ILLXC IDDXM
9 IX
-9 LLXCI LLIXC LILXC DDXMI DDIXM DIDXM
10 X
-10 LLXC DDXM
11 XI
-11 LLXIC DDXIM
12 XII
-12 VVVIXX LLXIIC LLVICC DDXIIM DDVIMM
13 VIVV XIII XIIV IXIV IIXV
-13 ILLXVC IDDXVM
14 XIV IXV
-14 LLXVCI LLXIVC LLIXVC LILXVC DDXVMI DDXIVM DDIXVM DIDXVM
15 XV
-15 LLXVC DDXVM
16 XVI
-16 LLXVIC ILLXCC DDXVIM IDDXMM
17 XVII IXIX
-17 XILLLC
18 IXX
-18 LLIXXC ILLXXC XLILLC LLIXCC LILXCC DDIXXM IDDXXM DDIXMM DIDXMM
19 XIX
-19 LLXXCI LLXIXC LILXXC XLLLCI XLLILC LLXCCI DDXXMI DDXIXM DIDXXM DDXMMI
20 XX
-20 LLXXC XLLLC LLXCC DDXXM DDXMM
21 XXI
-21 LLXXIC XLLLIC LLXCIC DDXXIM DDXMIM
22 XXII
-22 LLXICC DDXIMM
23 IXXV
-23 IXLLLC
24 XXIV XIXV
-24 LLXXVCI LLXXIVC LLXIXVC LILXXVC VLVLLCI VLVLILC IXLLLIC LXLLCVI LXLVLCI LXLVILC LXLILCV XLLLVCI XLLLIVC XLLILVC VLLLCXI VLLILCX LLXIICC LLXCVCI LLXCIVC VLLCLCI VLLCILC ILLXCCC DDXXVMI DDXXIVM DDXIXVM DIDXXVM XDDLMVI XDDVLMI XDDVILM XDDILMV XDVDLMI XDVDILM XDVIDLM XDIDLMV XDIDVLM DDXIIMM DDXMVMI DDXMIVM IDDXMMM
25 XXV
-25 LLXXVC VLVLLC LXLLCV LXLVLC XLLLVC VLLLCX LLXCVC VLLCLC DDXXVM XDDLMV XDDVLM XDVDLM DDXMVM
26 XXVI
-26 LLXXVIC VLVLLIC VILVLLC IVVLLLC LXIILLC IIXLLLC LXLLCIV LXLLICV LXLVLIC LXLIVLC XLLLVIC VLLLCIX VLLLICX ILXLLCV ILXLVLC LLXCVIC ILLXVCC ILLXCXC VLLCLIC DDXXVIM XIIDDLM XDDLMIV XDDLIMV XDDVLIM XDDIVLM XDVDLIM XDIVDLM DDXMVIM IDDXVMM IDDXMXM VIDDMLM
27 IXXX
-27 LLIXXXC VLIVLLC VILLLCV LXILLCI LXILILC LIXLLCV LIXLVLC XILLLXC ILLLCXX LLIXCXC LLIXCCC LILXCCC DDIXXXM XIDDLMI XIDDILM XIDIDLM IXDDLMV IXDDVLM IXDVDLM DDIXMXM DDIXMMM DIDXMMM
28 XIXX
-28 LXILLC XIDDLM
29 XXIX XXIL
-29 LXLLCI LXLILC XDDLMI XDDILM XDIDLM
30 XXX XXL
-30 LXLLC XDDLM
31 XXXI XXLI XIXL
-31 LXLLIC ILXLLC XDDLIM
32 IXXL
-32 VILLLC LIXLLC IXDDLM
33 XIXXV XVIIL IXXLI IXIXL
-33 VLILLC
34 XVIL
-34 VLLLCI VLLILC
35 XVL
-35 VLLLC
36 XVLI XIVL IXVL
-36 VLLLIC
37 XIXXX VIVVL XIIIL XVLII XIVLI XIIVL IXVLI IXIVL IIXVL IXXLV
-37 ILLLCX
38 XIIL
-38 LVILLC IVLLLC LILLCX VIDDLM IDDLMX IDDXLM
39 XIL
-39 LVLLCI LVLILC LLLCXI LLXLCI LLXILC LLILCX ILLXLC VDDLMI VDDILM VDIDLM DDLMXI DDXLMI DDXILM DDILMX DXDLMI DXDILM DXIDLM DIDLMX DIDXLM
40 XL
-40 LVLLC LLLCX LLXLC VDDLM DDLMX DDXLM DXDLM
41 XLI IXL
-41 LVLLIC ILVLLC LLLCIX LLLICX LLXLIC LLIXLC VDDLIM DDLMIX DDLIMX DDXLIM DDIXLM DXDLIM DIXDLM
42 IVVL XLII IXLI IIXL
-42 LIVLLC ILLLCV IVDDLM
43 VIIL
-43 LILLCV IDDLMV IDDVLM
44 VIL
-44 IILLLC LLLCVI LLVLCI LLVILC LLILCV ILLVLC DDLMVI DDVLMI DDVILM DDILMV DVDLMI DVDILM DVIDLM DIDLMV DIDVLM
45 VL
-45 LLLCV LLVLC DDLMV DDVLM DVDLM
46 VLI IVL
-46 LIILLC ILLLCI ILLILC LLLCIV LLLICV LLVLIC LLIVLC IIDDLM DDLMIV DDLIMV DDVLIM DDIVLM DVDLIM DIVDLM IDDMLM
47 IIIL VLII IVLI IIVL
-47 ILLLC
48 IIL
-48 LILLC IDDLM
49 IL
-49 LLLCI LLILC DDLMI DDILM DIDLM
50 L
-50 LLLC DDLM
51 LI
-51 LLLIC DDLIM
52 LII
-52 LLLIIC ILLLVC DDLIIM
53 LIII LIIV ILIV IILV
-53 LILLVC IDDLVM
54 LIV ILV
-54 LLLVCI LLLIVC LLILVC DDLVMI DDLIVM DDILVM DIDLVM
55 LV
-55 LLLVC DDLVM
56 LVI
-56 LLLVIC DDLVIM
57 LVII
-57 ILLLXC
58 LIVV LIIX ILIX IILX
-58 LILLXC IDDLXM
59 LIX ILX
-59 LLLXCI LLLIXC LLILXC DDLXMI DDLIXM DDILXM DIDLXM
60 LX
-60 LLLXC DDLXM
61 LXI
-61 LLLXIC DDLXIM
62 LXII
-62 LLLXIIC ILLLXVC LXLLICC ILXLLCC DDLXIIM XIXDDCM XDDLIMM
63 LVIVV LXIII LXIIV LIXIV LIIXV ILXIV ILIXV IILXV VLIXX
-63 LILLXVC IDDLXVM
64 LXIV LIXV ILXV
-64 LLLXVCI LLLXIVC LLLIXVC LLILXVC IXLLLLC VILLLCC LIXLLCC DDLXVMI DDLXIVM DDLIXVM DDILXVM DIDLXVM IXXDDCM IXDDLMM IXDDMCM
65 LXV
-65 LLLXVC DDLXVM
66 LXVI
-66 LLLXVIC VLILLCC DDLXVIM
67 LXVII LIXIX ILIXX
-67 ILLLXXC LXILLLC
68 LIXX
-68 LLLIXXC LILLXXC LXLILLC VLLILCC DDLIXXM IDDLXXM XVIDDCM XIDDCMX XIDDXCM
69 LXIX ILXX
-69 LLLXXCI LLLXIXC LLILXXC LXLLLCI LXLLILC VLLLCCI DDLXXMI DDLXIXM DDILXXM DIDLXXM XVDDCMI XVDDICM XVDIDCM XDDCMXI XDDXCMI XDDXICM XDDICMX XDXDCMI XDXDICM XDXIDCM XDIDCMX XDIDXCM
70 LXX
-70 LLLXXC LXLLLC VLLLCC DDLXXM XVDDCM XDDCMX XDDXCM XDXDCM
71 LXXI
-71 LLLXXIC LXLLLIC VLLLCIC DDLXXIM XVDDCIM XDDCMIX XDDCIMX XDDXCIM XDDIXCM XDXDCIM XDIXDCM
72 LXXII XIVLL IXVLL XIXXC
-72 ILXLLLC VLLLICC XIVDDCM IXVDDCM IXDDCMX IXDDXCM
73 LIXXV IXLXL IXXXC
-73 LIXLLLC XIDDCMV XIDDVCM IXDXDCM
74 LXXIV LXIXV ILXXV XLVIL XXVIC
-74 XDDCMVI XDDVCMI XDDVICM XDDICMV XDVDCMI XDVDICM XDVIDCM XDIDCMV XDIDVCM
75 LXXV XLVL XXVC
-75 XDDCMV XDDVCM XDVDCM
76 LXXVI XIILL XLVLI XLIVL XXVCI XXIVC XIXVC
-76 VILLLLC LVILLCC IVLLLCC XIIDDCM XDDCMIV XDDCIMV XDDVCIM XDDIVCM XDVDCIM XDIVDCM VIDDLMM IDDXLMM VIDDMCM
77 LIXXX XILIL IXLVL IXXVC
-77 VLILLLC XIDDCMI XIDDICM XIDIDCM IXDDCMV IXDDVCM IXDVDCM
78 XILL
-78 XIDDCM
79 XLIL XXIC
-79 XDDCMI XDDICM XDIDCM
80 XLL XXC
-80 XDDCM
81 XLLI XXCI XIXC
-81 XDDCIM
82 IXLL IXXC
-82 IXDDCM
83 IXLLI XILLV XVIIC IXXCI IXIXC
-83 LVLILLC VIDDCMV VIDDVCM IXDDCIM XIDDCVM IDDCMXV IDDXCMV IDDXVCM IDDVCMX
84 XVIC
-84 LVLLLCI LVLLILC IVLLLLC LIVLLCC ILLLCCX VDDCMVI VDDVCMI VDDVICM VDDICMV VDVDCMI VDVDICM VDVIDCM VDIDCMV VDIDVCM IVVDDCM IIXDDCM DDCMXVI DDXCMVI DDXVCMI DDXVICM DDXICMV DDVCMXI DDVICMX DDICMXV DXDCMVI DXDVCMI DXDVICM DXDICMV DXVDCMI DXVDICM DXVIDCM DXIDCMV DXIDVCM DVDCMXI DVDXCMI DVDXICM DVDICMX DVIDCMX DVIDXCM DIDCMXV DIDXCMV DIDXVCM DIDVCMX XDDCVMI XDDCIVM XDDICVM XDIDCVM IVDDLMM IVDDMCM
85 XVC
-85 LVLLLC VDDCMV VDDVCM VDVDCM DDCMXV DDXCMV DDXVCM DDVCMX DXDCMV DXDVCM DXVDCM DVDCMX DVDXCM XDDCVM
86 XVCI XIVC IXVC
-86 LVLLLIC ILLLLCX LILLCCX VIIDDCM VDDCMIV VDDCIMV VDDVCIM VDDIVCM VDVDCIM VDIVDCM IIDDCMX IIDDXCM DDCMXIV DDCMIXV DDCIMXV DDXCMIV DDXCIMV DDXVCIM DDXIVCM DDVCMIX DDVCIMX DDIXCMV DDIXVCM DDIVCMX DXDCMIV DXDCIMV DXDVCIM DXDIVCM DXVDCIM DXIVDCM DVDCMIX DVDCIMX DVDXCIM DVDIXCM DIXDCMV DIXDVCM DIXVDCM DIVDCMX DIVDXCM XDDCVIM IDDVLMM IDDLMMX IDDMCMX IDDMXCM
87 VILIL IXLLV VIVVC XIIIC XVCII XIVCI XIIVC IXVCI IXIVC IIXVC IXXCV
-87 ILVLLLC LILLLCX VIDDCMI VIDDICM VIDIDCM IVDDCMV IVDDVCM IDDCMXI IDDXCMI IDDXICM IDDICMX IDIDCMX IDIDXCM IXDDCVM
88 VILL XIIC
-88 VIDDCM IDDCMX IDDXCM
89 XIC
-89 VDDCMI VDDICM VDIDCM DDCMXI DDXCMI DDXICM DDICMX DXDCMI DXDICM DXIDCM DIDCMX DIDXCM
90 XC
-90 VDDCM DDCMX DDXCM DXDCM
91 XCI IXC
-91 VDDCIM DDCMIX DDCIMX DDXCIM DDIXCM DXDCIM DIXDCM
92 IVLL IVVC XCII IXCI IIXC
-92 IVDDCM
93 VIIC
-93 IDDCMV IDDVCM
94 VIC
-94 ILLLCC DDCMVI DDVCMI DDVICM DDICMV DVDCMI DVDICM DVIDCM DIDCMV DIDVCM
95 VC
-95 DDCMV DDVCM DVDCM
96 VCI IVC
-96 ILLLLC LILLCC IIDDCM DDCMIV DDCIMV DDVCIM DDIVCM DVDCIM DIVDCM IDDLMM IDDMCM
97 ILIL IIIC VCII IVCI IIVC
-97 LILLLC IDDCMI IDDICM IDIDCM
98 ILL IIC
-98 IDDCM
99 IC
-99 DDCMI DDICM DIDCM
100 C
-100 DDCM
Wow, the coding style is on par with terrible roman numbers :)
Why do you avoid using any indentation at all? It makes complex code even harder to read.