Sunday, September 20, 2020

[bvpqwmoh] Making Roman numerals worse

Roman numerals are terrible, but we explore making them even worse.

Extend additive notation the obvious way: iii=3, iiii=4, iiiii=5, iiiiii=6, vv=10, vvv=15.

Extend subtractive notation in the following (terrible) way.  We describe it around the character D (500), but the full system is its generalization to all Roman numeral characters.

First, consider uD, where u is a Roman numeral string composed of one or more of the characters I V X L C (i.e., anything less than D).  Interpret u (recursively) as a number.  Then, uD is 500-value(u).

For example, with x (10) in the role of D: ix = 9.  iix = 8.  vix = 10 - 6 = 4.

Incidentally, iiiiiv becomes one way to express zero; vvx is another.  We can also create negative numbers: vvix = -(5 + 5 + 1) + 10 = -1.

Next, consider uDvDwDx, where u, v, w, and x are each strings composed of characters less than D.  Evaluate as follows: value(uDvDwDx) = (500-value(u)) + (500-value(uv)) + (500-value(uvw)) + value(x).  Here uv and uvw denote string concatenation, so subtractive prefixes can get used multiple times.  It feels a bit like earlier strings, e.g., u, distribute over later strings v and w.
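Written out for any number k of D's, the same rule reads as follows.  The symbols u_1, ..., u_k and k are notation introduced here for illustration; each u_j and x consists only of characters less than D.

```latex
\operatorname{value}(u_1 D\, u_2 D \cdots u_k D\, x)
  = \sum_{j=1}^{k} \bigl( 500 - \operatorname{value}(u_1 u_2 \cdots u_j) \bigr)
  + \operatorname{value}(x)
```

With k = 3 this is exactly the uDvDwDx case.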

idid = id + iid = 499 + 498 = 997.  diid = 500 + 498 = 998.  iidd = iid + iid = 498 + 498 = 996.  idvd = id + ivd = 499 + 496 = 995.  vdidxd = vd + vid + vixd = 495 + (500-value(vi)) + (500-value(vix)) = 495 + 494 + 496 = 1485.

xixixixixixixixixixix = 10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 + 0 = 55.

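Subtractive prefixes get reused through concatenation only within one level of characters, that is, across repeated occurrences of the same character.  A prefix containing smaller characters is first evaluated as a whole: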

iv = 4.  ivx = 10-value(iv) = 6, not anything bizarre like value(ix)-value(iv).  ixvx = (10-value(i)) + (10-value(iv)) = 9 + 6 = 15.
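The rules so far can be sketched as a short recursive evaluator.  This is a minimal illustration written for this explanation, not the source code linked from this post; the names charValue and value are assumptions made up here.  At each step the largest character present in the string plays the role of D.

```haskell
import Data.Char (toUpper)

-- | Standard value of one Roman numeral character (case-insensitive).
charValue :: Char -> Integer
charValue c = case toUpper c of
  'I' -> 1
  'V' -> 5
  'X' -> 10
  'L' -> 50
  'C' -> 100
  'D' -> 500
  'M' -> 1000
  _   -> error ("not a Roman numeral character: " ++ [c])

-- | Value of a "worse" Roman numeral string (hypothetical helper name).
value :: String -> Integer
value "" = 0
value s = go "" s
  where
    -- The largest character value present plays the role of D in the text.
    top = maximum (map charValue s)

    -- 'prefix' holds every smaller character seen so far, in order.
    go prefix rest = case break ((== top) . charValue) rest of
      -- No more top characters: what is left is the trailing x.
      (x, []) -> value x
      -- u is the run of smaller characters before the next top character.
      (u, _ : rest') ->
        let prefix' = prefix ++ u
        in (top - value prefix') + go prefix' rest'
```

For instance, value "vdidxd" gives 1485 and value "ixvx" gives 15, matching the examples above, and an ordinary numeral such as value "MCMXCIV" still gives 1994.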

This system is backward compatible with existing Roman numerals.  It assigns values to strings which were previously invalid, and no previously valid strings change their value.  Every string of Roman numeral characters now has a unique value.  There are many ways to express any given value.

Here is Haskell source code for evaluating these "worse" Roman numerals.  We also provide routines for finding the shortest representation of a given Arabic number.  The shortest representation is found by breadth-first search.  At the end of this post, we give the shortest representation of all numbers from -100 to 100.  Some numbers have multiple possible shortest representations: we give them in a decreasing aesthetic order described in the compareroman function.
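As a rough illustration of the search for shortest representations, here is a brute-force sketch that explores strings level by level (all strings of length 1, then length 2, and so on) and stops at the first length that contains a hit.  It is an assumption about how such a search could look, not the linked source code: the names alphabet and shortest are made up here, and the results come out in plain enumeration order rather than the aesthetic order of compareroman.

```haskell
import Control.Monad (replicateM)

-- | Standard value of one (upper-case) Roman numeral character.
charValue :: Char -> Integer
charValue c = case c of
  'I' -> 1
  'V' -> 5
  'X' -> 10
  'L' -> 50
  'C' -> 100
  'D' -> 500
  'M' -> 1000
  _   -> error ("not a Roman numeral character: " ++ [c])

-- | Value of a "worse" Roman numeral string (same rule as described above).
value :: String -> Integer
value "" = 0
value s = go "" s
  where
    top = maximum (map charValue s)
    go prefix rest = case break ((== top) . charValue) rest of
      (x, []) -> value x
      (u, _ : rest') ->
        let prefix' = prefix ++ u
        in (top - value prefix') + go prefix' rest'

-- | Characters available to the search.
alphabet :: String
alphabet = "IVXLCDM"

-- | All shortest nonempty strings whose value equals the target.
-- Level n of the search tree is every string of length n.
shortest :: Integer -> [String]
shortest target = go 1
  where
    go n = case [s | s <- replicateM n alphabet, value s == target] of
      []   -> go (n + 1)   -- nothing at this depth, so go one level deeper
      hits -> hits
```

Level n costs 7^n evaluations, so this sketch is only practical for short representations.  The search starts at length 1, so the empty string is not reported for zero; shortest 0 finds VVX, LLC and DDM.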

Future work: assume the minus sign is available for negation.  For what negative numbers (if any) does the shortest representation not use it?  Are there any positive numbers whose shortest representation does use it?  (Both of these seem unlikely.)  Another possibility: invent a character for zero, perhaps N for nihil or nullus.  Consider expressing negative numbers with subtractive prefixes in front of N.

Future work: instead of the standard set of Roman numeral characters and their values, consider some different set of values for characters: powers of 2, Fibonacci numbers, square numbers.  Does anything interesting happen?  Perhaps interesting things happen when seeking the shortest representation of numbers.

We define a sequence "worstcase" that exemplifies the "worst case" of our worse Roman numeral system.  Each line below has the next largest Roman numeral character interleaved between the characters of the previous line, and one more at the end.

1 I
4 IV
15 IXVX
161 ILXLVLXL
83 ICLCXCLCVCLCXCLC
5349 IDCDLDCDXDCDLDCDVDCDLDCDXDCDLDCD
-57745 IMDMCMDMLMDMCMDMXMDMCMDMLMDMCMDMVMDMCMDMLMDMCMDMXMDMCMDMLMDMCMDM
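The interleaving construction can be sketched in a few lines.  This is an illustrative reconstruction, not the linked source code; it reuses the name worstcase from the text, while charValue and value are helper names assumed here.

```haskell
-- | Standard value of one (upper-case) Roman numeral character.
charValue :: Char -> Integer
charValue c = case c of
  'I' -> 1
  'V' -> 5
  'X' -> 10
  'L' -> 50
  'C' -> 100
  'D' -> 500
  'M' -> 1000
  _   -> error ("not a Roman numeral character: " ++ [c])

-- | Value of a "worse" Roman numeral string (same rule as described above).
value :: String -> Integer
value "" = 0
value s = go "" s
  where
    top = maximum (map charValue s)
    go prefix rest = case break ((== top) . charValue) rest of
      (x, []) -> value x
      (u, _ : rest') ->
        let prefix' = prefix ++ u
        in (top - value prefix') + go prefix' rest'

-- | The strings I, IV, IXVX, ...: each step follows every character of the
-- previous string with the next larger Roman numeral character.
worstcase :: [String]
worstcase = scanl step "I" "VXLCDM"
  where
    step prev c = concatMap (\ch -> [ch, c]) prev
```

Here map value (take 6 worstcase) gives [1,4,15,161,83,5349], the first six values listed above.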

With Roman numeral characters beyond M, the sequence would continue:

1304801 -42453397 2336054109 -221579717657 36896704797401 -10904184517859485 5768308016008033877 -5503513512222683409697

Getting the last 4 values required memoization in order not to run out of memory.  We used Data.MemoTrie.memoFix from the MemoTrie package, version 0.6.9.  It had higher performance than Data.Function.Memoize in the memoize package, version 0.8.1.  We did not have the patience to compute the next number in the sequence.
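To show how memoFix can be wired in, here is a minimal sketch that assumes the MemoTrie package is installed.  The evaluator is written in open-recursive style, so that every recursive call goes through the memoized function.  The names valueStep and valueMemo are made up for this illustration, it covers only the standard characters up to M, and it is not necessarily how the linked source code is structured.

```haskell
import Data.MemoTrie (memoFix)

-- | Standard value of one (upper-case) Roman numeral character.
charValue :: Char -> Integer
charValue c = case c of
  'I' -> 1
  'V' -> 5
  'X' -> 10
  'L' -> 50
  'C' -> 100
  'D' -> 500
  'M' -> 1000
  _   -> error ("not a Roman numeral character: " ++ [c])

-- | One layer of the evaluator.  Recursive calls go through 'self'
-- instead of calling the evaluator directly.
valueStep :: (String -> Integer) -> String -> Integer
valueStep _ "" = 0
valueStep self s = go "" s
  where
    top = maximum (map charValue s)
    go prefix rest = case break ((== top) . charValue) rest of
      (x, []) -> self x
      (u, _ : rest') ->
        let prefix' = prefix ++ u
        in (top - self prefix') + go prefix' rest'

-- | Memoized evaluator: memoFix ties the knot, so each distinct string
-- is evaluated at most once and then looked up in the trie.
valueMemo :: String -> Integer
valueMemo = memoFix valueStep
```

Memoization helps here because the same accumulated prefixes are needed again and again: every prefix of one worstcase string is evaluated while evaluating the next.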


0 VVX LLC DDM

1 I

-1 VVIX LLIC DDIM

2 II

-2 VVIIX IVVVX VVIXX LLIIC LLICC DDIIM DDIMM

3 III IIV

-3 VIVVX ILLVC IDDVM

4 IV

-4 VVVXI VVIVX LLVCI LLIVC LILVC DDVMI DDIVM DIDVM

5 V

-5 VVVX LLVC DDVM

6 VI

-6 VVVIX LLVIC DDVIM

7 VII

-7 VVVIIX VIVVVX LLVIIC ILLXCI ILLIXC ILILXC DDVIIM IDDXMI IDDIXM IDIDXM

8 IVV IIX

-8 ILLXC IDDXM

9 IX

-9 LLXCI LLIXC LILXC DDXMI DDIXM DIDXM

10 X

-10 LLXC DDXM

11 XI

-11 LLXIC DDXIM

12 XII

-12 VVVIXX LLXIIC LLVICC DDXIIM DDVIMM

13 VIVV XIII XIIV IXIV IIXV

-13 ILLXVC IDDXVM

14 XIV IXV

-14 LLXVCI LLXIVC LLIXVC LILXVC DDXVMI DDXIVM DDIXVM DIDXVM

15 XV

-15 LLXVC DDXVM

16 XVI

-16 LLXVIC ILLXCC DDXVIM IDDXMM

17 XVII IXIX

-17 XILLLC

18 IXX

-18 LLIXXC ILLXXC XLILLC LLIXCC LILXCC DDIXXM IDDXXM DDIXMM DIDXMM

19 XIX

-19 LLXXCI LLXIXC LILXXC XLLLCI XLLILC LLXCCI DDXXMI DDXIXM DIDXXM DDXMMI

20 XX

-20 LLXXC XLLLC LLXCC DDXXM DDXMM

21 XXI

-21 LLXXIC XLLLIC LLXCIC DDXXIM DDXMIM

22 XXII

-22 LLXICC DDXIMM

23 IXXV

-23 IXLLLC

24 XXIV XIXV

-24 LLXXVCI LLXXIVC LLXIXVC LILXXVC VLVLLCI VLVLILC IXLLLIC LXLLCVI LXLVLCI LXLVILC LXLILCV XLLLVCI XLLLIVC XLLILVC VLLLCXI VLLILCX LLXIICC LLXCVCI LLXCIVC VLLCLCI VLLCILC ILLXCCC DDXXVMI DDXXIVM DDXIXVM DIDXXVM XDDLMVI XDDVLMI XDDVILM XDDILMV XDVDLMI XDVDILM XDVIDLM XDIDLMV XDIDVLM DDXIIMM DDXMVMI DDXMIVM IDDXMMM

25 XXV

-25 LLXXVC VLVLLC LXLLCV LXLVLC XLLLVC VLLLCX LLXCVC VLLCLC DDXXVM XDDLMV XDDVLM XDVDLM DDXMVM

26 XXVI

-26 LLXXVIC VLVLLIC VILVLLC IVVLLLC LXIILLC IIXLLLC LXLLCIV LXLLICV LXLVLIC LXLIVLC XLLLVIC VLLLCIX VLLLICX ILXLLCV ILXLVLC LLXCVIC ILLXVCC ILLXCXC VLLCLIC DDXXVIM XIIDDLM XDDLMIV XDDLIMV XDDVLIM XDDIVLM XDVDLIM XDIVDLM DDXMVIM IDDXVMM IDDXMXM VIDDMLM

27 IXXX

-27 LLIXXXC VLIVLLC VILLLCV LXILLCI LXILILC LIXLLCV LIXLVLC XILLLXC ILLLCXX LLIXCXC LLIXCCC LILXCCC DDIXXXM XIDDLMI XIDDILM XIDIDLM IXDDLMV IXDDVLM IXDVDLM DDIXMXM DDIXMMM DIDXMMM

28 XIXX

-28 LXILLC XIDDLM

29 XXIX XXIL

-29 LXLLCI LXLILC XDDLMI XDDILM XDIDLM

30 XXX XXL

-30 LXLLC XDDLM

31 XXXI XXLI XIXL

-31 LXLLIC ILXLLC XDDLIM

32 IXXL

-32 VILLLC LIXLLC IXDDLM

33 XIXXV XVIIL IXXLI IXIXL

-33 VLILLC

34 XVIL

-34 VLLLCI VLLILC

35 XVL

-35 VLLLC

36 XVLI XIVL IXVL

-36 VLLLIC

37 XIXXX VIVVL XIIIL XVLII XIVLI XIIVL IXVLI IXIVL IIXVL IXXLV

-37 ILLLCX

38 XIIL

-38 LVILLC IVLLLC LILLCX VIDDLM IDDLMX IDDXLM

39 XIL

-39 LVLLCI LVLILC LLLCXI LLXLCI LLXILC LLILCX ILLXLC VDDLMI VDDILM VDIDLM DDLMXI DDXLMI DDXILM DDILMX DXDLMI DXDILM DXIDLM DIDLMX DIDXLM

40 XL

-40 LVLLC LLLCX LLXLC VDDLM DDLMX DDXLM DXDLM

41 XLI IXL

-41 LVLLIC ILVLLC LLLCIX LLLICX LLXLIC LLIXLC VDDLIM DDLMIX DDLIMX DDXLIM DDIXLM DXDLIM DIXDLM

42 IVVL XLII IXLI IIXL

-42 LIVLLC ILLLCV IVDDLM

43 VIIL

-43 LILLCV IDDLMV IDDVLM

44 VIL

-44 IILLLC LLLCVI LLVLCI LLVILC LLILCV ILLVLC DDLMVI DDVLMI DDVILM DDILMV DVDLMI DVDILM DVIDLM DIDLMV DIDVLM

45 VL

-45 LLLCV LLVLC DDLMV DDVLM DVDLM

46 VLI IVL

-46 LIILLC ILLLCI ILLILC LLLCIV LLLICV LLVLIC LLIVLC IIDDLM DDLMIV DDLIMV DDVLIM DDIVLM DVDLIM DIVDLM IDDMLM

47 IIIL VLII IVLI IIVL

-47 ILLLC

48 IIL

-48 LILLC IDDLM

49 IL

-49 LLLCI LLILC DDLMI DDILM DIDLM

50 L

-50 LLLC DDLM

51 LI

-51 LLLIC DDLIM

52 LII

-52 LLLIIC ILLLVC DDLIIM

53 LIII LIIV ILIV IILV

-53 LILLVC IDDLVM

54 LIV ILV

-54 LLLVCI LLLIVC LLILVC DDLVMI DDLIVM DDILVM DIDLVM

55 LV

-55 LLLVC DDLVM

56 LVI

-56 LLLVIC DDLVIM

57 LVII

-57 ILLLXC

58 LIVV LIIX ILIX IILX

-58 LILLXC IDDLXM

59 LIX ILX

-59 LLLXCI LLLIXC LLILXC DDLXMI DDLIXM DDILXM DIDLXM

60 LX

-60 LLLXC DDLXM

61 LXI

-61 LLLXIC DDLXIM

62 LXII

-62 LLLXIIC ILLLXVC LXLLICC ILXLLCC DDLXIIM XIXDDCM XDDLIMM

63 LVIVV LXIII LXIIV LIXIV LIIXV ILXIV ILIXV IILXV VLIXX

-63 LILLXVC IDDLXVM

64 LXIV LIXV ILXV

-64 LLLXVCI LLLXIVC LLLIXVC LLILXVC IXLLLLC VILLLCC LIXLLCC DDLXVMI DDLXIVM DDLIXVM DDILXVM DIDLXVM IXXDDCM IXDDLMM IXDDMCM

65 LXV

-65 LLLXVC DDLXVM

66 LXVI

-66 LLLXVIC VLILLCC DDLXVIM

67 LXVII LIXIX ILIXX

-67 ILLLXXC LXILLLC

68 LIXX

-68 LLLIXXC LILLXXC LXLILLC VLLILCC DDLIXXM IDDLXXM XVIDDCM XIDDCMX XIDDXCM

69 LXIX ILXX

-69 LLLXXCI LLLXIXC LLILXXC LXLLLCI LXLLILC VLLLCCI DDLXXMI DDLXIXM DDILXXM DIDLXXM XVDDCMI XVDDICM XVDIDCM XDDCMXI XDDXCMI XDDXICM XDDICMX XDXDCMI XDXDICM XDXIDCM XDIDCMX XDIDXCM

70 LXX

-70 LLLXXC LXLLLC VLLLCC DDLXXM XVDDCM XDDCMX XDDXCM XDXDCM

71 LXXI

-71 LLLXXIC LXLLLIC VLLLCIC DDLXXIM XVDDCIM XDDCMIX XDDCIMX XDDXCIM XDDIXCM XDXDCIM XDIXDCM

72 LXXII XIVLL IXVLL XIXXC

-72 ILXLLLC VLLLICC XIVDDCM IXVDDCM IXDDCMX IXDDXCM

73 LIXXV IXLXL IXXXC

-73 LIXLLLC XIDDCMV XIDDVCM IXDXDCM

74 LXXIV LXIXV ILXXV XLVIL XXVIC

-74 XDDCMVI XDDVCMI XDDVICM XDDICMV XDVDCMI XDVDICM XDVIDCM XDIDCMV XDIDVCM

75 LXXV XLVL XXVC

-75 XDDCMV XDDVCM XDVDCM

76 LXXVI XIILL XLVLI XLIVL XXVCI XXIVC XIXVC

-76 VILLLLC LVILLCC IVLLLCC XIIDDCM XDDCMIV XDDCIMV XDDVCIM XDDIVCM XDVDCIM XDIVDCM VIDDLMM IDDXLMM VIDDMCM

77 LIXXX XILIL IXLVL IXXVC

-77 VLILLLC XIDDCMI XIDDICM XIDIDCM IXDDCMV IXDDVCM IXDVDCM

78 XILL

-78 XIDDCM

79 XLIL XXIC

-79 XDDCMI XDDICM XDIDCM

80 XLL XXC

-80 XDDCM

81 XLLI XXCI XIXC

-81 XDDCIM

82 IXLL IXXC

-82 IXDDCM

83 IXLLI XILLV XVIIC IXXCI IXIXC

-83 LVLILLC VIDDCMV VIDDVCM IXDDCIM XIDDCVM IDDCMXV IDDXCMV IDDXVCM IDDVCMX

84 XVIC

-84 LVLLLCI LVLLILC IVLLLLC LIVLLCC ILLLCCX VDDCMVI VDDVCMI VDDVICM VDDICMV VDVDCMI VDVDICM VDVIDCM VDIDCMV VDIDVCM IVVDDCM IIXDDCM DDCMXVI DDXCMVI DDXVCMI DDXVICM DDXICMV DDVCMXI DDVICMX DDICMXV DXDCMVI DXDVCMI DXDVICM DXDICMV DXVDCMI DXVDICM DXVIDCM DXIDCMV DXIDVCM DVDCMXI DVDXCMI DVDXICM DVDICMX DVIDCMX DVIDXCM DIDCMXV DIDXCMV DIDXVCM DIDVCMX XDDCVMI XDDCIVM XDDICVM XDIDCVM IVDDLMM IVDDMCM

85 XVC

-85 LVLLLC VDDCMV VDDVCM VDVDCM DDCMXV DDXCMV DDXVCM DDVCMX DXDCMV DXDVCM DXVDCM DVDCMX DVDXCM XDDCVM

86 XVCI XIVC IXVC

-86 LVLLLIC ILLLLCX LILLCCX VIIDDCM VDDCMIV VDDCIMV VDDVCIM VDDIVCM VDVDCIM VDIVDCM IIDDCMX IIDDXCM DDCMXIV DDCMIXV DDCIMXV DDXCMIV DDXCIMV DDXVCIM DDXIVCM DDVCMIX DDVCIMX DDIXCMV DDIXVCM DDIVCMX DXDCMIV DXDCIMV DXDVCIM DXDIVCM DXVDCIM DXIVDCM DVDCMIX DVDCIMX DVDXCIM DVDIXCM DIXDCMV DIXDVCM DIXVDCM DIVDCMX DIVDXCM XDDCVIM IDDVLMM IDDLMMX IDDMCMX IDDMXCM

87 VILIL IXLLV VIVVC XIIIC XVCII XIVCI XIIVC IXVCI IXIVC IIXVC IXXCV

-87 ILVLLLC LILLLCX VIDDCMI VIDDICM VIDIDCM IVDDCMV IVDDVCM IDDCMXI IDDXCMI IDDXICM IDDICMX IDIDCMX IDIDXCM IXDDCVM

88 VILL XIIC

-88 VIDDCM IDDCMX IDDXCM

89 XIC

-89 VDDCMI VDDICM VDIDCM DDCMXI DDXCMI DDXICM DDICMX DXDCMI DXDICM DXIDCM DIDCMX DIDXCM

90 XC

-90 VDDCM DDCMX DDXCM DXDCM

91 XCI IXC

-91 VDDCIM DDCMIX DDCIMX DDXCIM DDIXCM DXDCIM DIXDCM

92 IVLL IVVC XCII IXCI IIXC

-92 IVDDCM

93 VIIC

-93 IDDCMV IDDVCM

94 VIC

-94 ILLLCC DDCMVI DDVCMI DDVICM DDICMV DVDCMI DVDICM DVIDCM DIDCMV DIDVCM

95 VC

-95 DDCMV DDVCM DVDCM

96 VCI IVC

-96 ILLLLC LILLCC IIDDCM DDCMIV DDCIMV DDVCIM DDIVCM DVDCIM DIVDCM IDDLMM IDDMCM

97 ILIL IIIC VCII IVCI IIVC

-97 LILLLC IDDCMI IDDICM IDIDCM

98 ILL IIC

-98 IDDCM

99 IC

-99 DDCMI DDICM DIDCM

100 C

-100 DDCM

1 comment :

Unknown said...

Wow, the coding style is on par with terrible roman numbers :)

Why do you avoid using any indentation at all? It makes complex code even harder to read.