Monday, January 08, 2018

[iblofees] Calendar Facts

Here is a machine readable version of xkcd #1930 "Calendar Facts", perhaps useful for followup projects like creating a random fact generator.  Further notes follow.

Sequence [Atom "Did you know that",Choice [Sequence [Atom "the",Choice [Sequence [Choice [Atom "fall",Atom "spring"],Atom "equinox"],Sequence [Choice [Atom "winter",Atom "summer"],Choice [Atom "solstice",Atom "Olympics"]],Sequence [Choice [Atom "earliest",Atom "latest"],Choice [Atom "sunrise",Atom "sunset"]]]],Sequence [Atom "Daylight",Choice [Atom "Saving",Atom "Savings"],Atom "Time"],Sequence [Atom "leap",Choice [Atom "day",Atom "year"]],Atom "Easter",Sequence [Atom "the",Choice [Atom "Harvest",Atom "super",Atom "blood"],Atom "moon"],Atom "Toyota Truck Month",Atom "Shark Week"],Choice [Sequence [Atom "happens",Choice [Atom "earlier",Atom "later",Atom "at the wrong time"],Atom "every year"],Sequence [Atom "drifts out of sync with the",Choice [Atom "sun",Atom "moon",Atom "zodiac",Sequence [Choice [Atom "Gregorian",Atom "Mayan",Atom "lunar",Atom "iPhone"],Atom "calendar"],Atom "atomic clock in Colorado"]],Sequence [Atom "might",Choice [Atom "not happen",Atom "happen twice"],Atom "this year"]],Atom "because of",Choice [Sequence [Atom "time zone legislation in",Choice [Atom "Indiana",Atom "Arizona",Atom "Russia"]],Atom "a decree by the Pope in the 1500s",Sequence [Choice [Atom "precession",Atom "libration",Atom "nutation",Atom "libation",Atom "eccentricity",Atom "obliquity"],Atom "of the",Choice [Atom "moon",Atom "sun",Atom "earth's axis",Atom "equator",Atom "prime meridian",Sequence [Choice [Atom "International Date",Atom "Mason-Dixon"],Atom "line"]]],Atom "magnetic field reversal",Sequence [Atom "an arbitrary decision by",Choice [Atom "Benjamin Franklin",Atom "Isaac Newton",Atom "FDR"]]],Atom "? ",Atom "Apparently",Choice [Atom "it causes a predictable increase in car accidents",Atom "that's why we have leap seconds",Atom "scientists are really worried",Sequence [Atom "it was even more extreme during the",Choice [Sequence [Choice [Atom "Bronze",Atom "Ice"],Atom "Age"],Atom "Cretaceous",Atom "1990s"]],Sequence [Atom "there's a proposal to fix it, but it",Choice [Atom "will never happen",Atom "actually makes things worse",Atom "is stalled in Congress",Atom "might be unconstitutional"]],Atom "it's getting worse and no one knows why"],Atom ". ",Atom "While it may seem like trivia, it",Choice [Atom "causes huge headaches for software developers",Atom "is taken advantage of by high-speed traders",Atom "triggered the 2003 Northeast Blackout",Atom "has to be corrected for by GPS satellites",Atom "is now recognized as a major cause of World War I"],Atom "."]

Including the mouseover text, the grammar encodes 780,000 facts.

The above grammar is the output of "show" by a Haskell program where we typed the grammar slightly more compactly, using/abusing the OverloadedStrings language extension and a Num instance.  OverloadedStrings is a nice language extension for when we have a large number of literals in the source.  Excerpts of the full source code below:

{-# LANGUAGE OverloadedStrings #-}

data Grammar = Atom String | Choice [Grammar] | Sequence [Grammar] deriving (Show,Eq);

instance IsString Grammar where {
fromString = Atom;
};

instance Num Grammar where {
(+) x y = Choice [x,y];
(*) x y = Sequence [x,y];
abs = undefined;
signum = undefined;
fromInteger = undefined;
negate = undefined;
};

facts :: Grammar;
facts =
    "Did you know that"
    *(
      ("the"
       *((("fall"+"spring")*"equinox")
         +(("winter"+"summer")*("solstice"+"Olympics"))...

No comments :