Friday, April 27, 2007

Comments on GHC Random Number Generator

From the Glasgow Haskell Compiler 6.6 implementation of the module System.Random

instance Read StdGen where
  readsPrec _p = \ r ->
     case try_read r of
       r@[_] -> r
       _   -> [stdFromString r] -- because it shouldn't ever fail.
    where 
      try_read r = do

One can inject invalid state here (out of range) or 0 0.

         (s1, r1) <- readDec (dropWhile isSpace r)
         (s2, r2) <- readDec (dropWhile isSpace r1)
  return (StdGen s1 s2, r2)
{-
 If we cannot unravel the StdGen from a string, create
 one based on the string given.
-}

I wish this would use a real hash function, for example, SHA-256. It consumes only 6 characters! Since it has about 62 bits of state, it ought to in principle consume 8, though problems with overflow and the 3 multiplier might cause complications.

stdFromString         :: String -> (StdGen, String)
stdFromString s        = (mkStdGen num, rest)
 where (cs, rest) = splitAt 6 s
              num        = foldl (\a x -> x + 3 * a) 1 (map ord cs)
{- |
The function 'mkStdGen' provides an alternative way of producing an initial
generator, by mapping an 'Int' into a generator. Again, distinct arguments
should be likely to produce distinct generators.

Programmers may, of course, supply their own instances of 'RandomGen'.
-}
mkStdGen :: Int -> StdGen -- why not Integer ?

Yes, why not? Then the user could specify lots of initial entropy.

mkStdGen s
 | s < 0     = mkStdGen (-s)
 | otherwise = StdGen (s1+1) (s2+1)
      where
 (q, s1) = s `divMod` 2147483562
 s2      = q `mod` 2147483398

It should be commented that we use one less than the moduli to avoid the initial state being zero.

Look at all these hard coded numbers! Ideally they should be specified exactly once, in factored form, as in L'Ecuyer's paper. The factorization (showing no common factors between the moduli) demonstrates the long period.

Sadly, the function below is not exported.

createStdGen :: Integer -> StdGen
createStdGen s
 | s < 0     = createStdGen (-s)
 | otherwise = StdGen (fromInteger (s1+1)) (fromInteger (s2+1))
      where
 (q, s1) = s `divMod` 2147483562
 s2      = q `mod` 2147483398

-- FIXME: 1/2/3 below should be ** (vs@30082002) XXX

FIXME huh?

...


The lower 8 bits are used to create a random Char. (See the definition, especially the `mod` call in randomIvalInteger below.) The lower bits of RNG are suspect.

Even worse the lower 1 bits are used to create a random Bool.

instance Random Char where
  randomR (a,b) g = 
      case (randomIvalInteger (toInteger (ord a), toInteger (ord b)) g) of
        (x,g) -> (chr x, g)
  random g   = randomR (minBound,maxBound) g

instance Random Bool where
  randomR (a,b) g = 
      case (randomIvalInteger (toInteger (bool2Int a), toInteger (bool2Int b)) g) of
        (x, g) -> (int2Bool x, g)
       where
         bool2Int False = 0
         bool2Int True  = 1

  int2Bool 0 = False
  int2Bool _ = True

  random g   = randomR (minBound,maxBound) g
 

-- hah, so you thought you were saving cycles by using Float?

Yes, it sure would be nice. It would also be nice to avoid invoking GMP (via type Integer) when creating random numbers of the standard small types.

instance Random Float where
  random g        = randomIvalDouble (0::Double,1) realToFrac g
  randomR (a,b) g = randomIvalDouble (realToFrac a, realToFrac b) realToFrac g

mkStdRNG :: Integer -> IO StdGen
mkStdRNG o = do
    ct          <- getCPUTime
    (TOD sec _) <- getClockTime
    return (createStdGen (sec * 12345 + ct + o))

Is 12345 the optimal constant? Does it interact badly with the modulus 2147483562?

randomIvalInteger :: (RandomGen g, Num a) => (Integer, Integer) -> g -> (a, g)
randomIvalInteger (l,h) rng
 | l > h     = randomIvalInteger (h,l) rng
 | otherwise = case (f n 1 rng) of (v, rng') -> (fromInteger (l + v `mod` k), rng')

So iLogBase overshoots, then `mod` is used to bring it back into range. The wraparound overshoot makes it no longer uniform.

     where
       k = h - l + 1
       b = 2147483561

Shouldn't this be 2147483562? We are creating a big number in base 2147483562.

       n = iLogBase b k

       f 0 acc g = (acc, g)
       f n acc g = 
          let
    (x,g')   = next g

Might need to subtract 1 from x.

   in
   f (n-1) (fromIntegral x + acc * b) g'

randomIvalDouble :: (RandomGen g, Fractional a) => (Double, Double) -> (Double -> a) -> g -> (a, g)
randomIvalDouble (l,h) fromDouble rng 
  | l > h     = randomIvalDouble (h,l) fromDouble rng
  | otherwise =

Instead of choosing the pre-scaled range to be the same as the modulus (about 2 billion), we instead do from -2G to 2G, or 4 billion. This will require two calls to the RNG, then throwing away almost all the bits of the second call (which ought to be used for double-precision mantissa!)

       case (randomIvalInteger (toInteger (minBound::Int), toInteger (maxBound::Int)) rng) of
         (x, rng') -> 

Consumes only 32-ish bits of randomness to produce 52 bits of double-precision mantissa!

     let
      scaled_x = 
  fromDouble ((l+h)/2) +

The divide by 2 factor is because intRange goes from -2G to 2G.


                fromDouble ((h-l) / realToFrac intRange) *
  fromIntegral (x::Int)
     in
     (scaled_x, rng')

intRange returns approximately 4 * 109.

intRange :: Integer
intRange  = toInteger (maxBound::Int) - toInteger (minBound::Int)

This function iLogBase has quadratic running time.

iLogBase :: Integer -> Integer -> Integer
iLogBase b i = if i < b then 1 else 1 + iLogBase b (i `div` b)


stdRange :: (Int,Int)

The range is actually from 1.

stdRange = (0, 2147483562)

stdNext :: StdGen -> (Int, StdGen)
-- Returns values in the range stdRange
stdNext (StdGen s1 s2) = (z', StdGen s1'' s2'')
 where z'   = if z < 1 then z + 2147483562 else z

The value z' = 0 cannot be produced! The statistical independence between s1'' and s2'' is violated, so I'm not sure about the uniformity.

  z    = s1'' - s2''

  k    = s1 `quot` 53668
  s1'  = 40014 * (s1 - k * 53668) - k * 12211
  s1'' = if s1' < 0 then s1' + 2147483563 else s1'
    
  k'   = s2 `quot` 52774
  s2'  = 40692 * (s2 - k' * 52774) - k' * 3791
  s2'' = if s2' < 0 then s2' + 2147483399 else s2'

stdSplit            :: StdGen -> (StdGen, StdGen)
stdSplit std@(StdGen s1 s2)
                     = (left, right)
                       where
                        -- no statistical foundation for this!
                        left    = StdGen new_s1 t2
                        right   = StdGen t1 new_s2

                        new_s1 | s1 == 2147483562 = 1
                               | otherwise        = s1 + 1

                        new_s2 | s2 == 1          = 2147483398
                               | otherwise        = s2 - 1

                        StdGen t1 t2 = snd (next std)

Thursday, April 26, 2007

Colored programming IDE

Highlight every identifier by a color derived from (say) the hash of the identifier. This way a programmer might scan over code and possibly be able to sense a bug simply by looking at the pattern of colors. Two similar looking identifiers would likely have different colors.

Parentheses (and brackets, etc.) should be color coded to match each other.

Super Athletic Chess

One can combine World's Strongest Chessman with Condi Chess to create an awesome chess variant. It tests physical strength for moving the pieces, and stamina and endurance for running to the clock, memory for opening theory and endgame and of course general chess ability. The game design problem is to balance all the aspects, like the weight of the pieces and the distance to the board and the time on the clock so only a balanced competitor may win, or alternatively, that there are many different ways to win. If the board is obscured from view from the clock, and the players have to stay at the clock when not their turn, there is the additional challenge of having to visualize the position when thinking on the opponents' time.

One might have a clock on a conveyor belt which slowly recedes from the board to quadratically penalize slow play.

Tuesday, April 24, 2007

Space-filling polyhedra

Much has been done on space-filling polyhedron (other keywords: Dirichlet domains, Voronoi cells, stereohedra, crystallography) but it's not all collected well on the Internet. What is the complete list of single space-filling convex polyhedra modulo simple deformations? Are they all known? Space filling isohedra? Space-filling polyhedra with all edge lengths equal?

A toy I would like to see is the Kelvin cell with each face magnetic to build structures with. I still cannot imagine how a shape with curved faces can fill space, or how a shape can have curved edges but flat faces. (Which is it?)

Monday, April 23, 2007

Factored Prime Gap

What is the largest prime gap for which every composite in between the two primes is fully factored? Has at least one known factor?

Saturday, April 21, 2007

Pitching ERA

Pitcher's ERA is not the best way to measure performance. A pitcher with lots of run support need not be so careful. So, the first interesting statistic would be ERA minus average run support. The second takes into account variance of the two statistics and estimates the probability that the difference is positive. The third takes into account the opposing pitcher's ability, estimating the expected run support for each particular game.

They say solo home runs don't hurt you very much. What happens if you subtract solo HRs from ERA? Who has the largest and smallest percentage of ERA as solo HRs? Ratio of solo vs. total HR?

Friday, April 20, 2007

Gomoku go

(囲碁) What happens if the winner is the first to create a living group? Side, corner, two eyes, or other life?

Monday, April 09, 2007

Bank LiveCD

Banks etc could give out a LiveCD for secure online access. Or make your own. Hardwire just one website, to prevent phishing.

Even more securely, make the CD a unique token.

Need OS support for not-inconvenient suspend and rapid Live CD booting.

1029

1029 = 44 · 4 + 4 + 1 = 73 · 3

Off by one

for(a=1,50,print(a," ",factor(a^a*a+a+1)))

1 Mat([3, 1])
2 Mat([11, 1])
3 [5, 1; 17, 1]
4 [3, 1; 7, 3]
5 7, 2; 11, 1; ...
6 271, 1; ...
7 3, 1; 19, 2; ...
8 17, 1; 1361, 1; ...
9 11, 1; 19, 1; ...
10 3, 1; 37, 2; ...
11 13, 1; 36529, 1; ...
12 5, 1; 71, 1; ...
13 3, 1; 19, 1; 61, 2; 461, 1; 2713, 1; ...
14 277, 1; ...
15 17, 1; 63857, 1; ...
16 3, 1; 7, 2; 13, 2; ...
17 11, 1; 19, 1; ...
18 201000229, 1; ...
19 3, 1; 127, 2; 167, 1; 130841, 1; 936283, 1; ...
20 41, 1; 53549, 1; 244813103, 1; ...
21 19, 1; 23, 1; 43, 1; 461, 1; ...
22 3, 1; 13, 4; 19, 2; 239, 1; 3019, 1; 28439, 1; ...
23 5, 1; ...
24 7, 1; 31, 1; 367, 1; 3697, 1; 4019, 1; 4421, 1; ...
25 3, 1; 7, 3; 31, 2; 47, 1; 283, 1; 164485232117, 1; ...
26 1151, 1; 1949, 1; 2420321, 1; 2994661, 1; 192604459, 1; ...
27 29, 1; 2087, 1; 20693, 1; 127703, 1; 1496783, 1; ...
28 3, 1; 23, 1; 271, 2; 7349, 1; 7031023, 1; 16386187, 1; ...
29 31, 1; 59, 1; 179, 1; 287356731133, 1; ...
30 23, 1; 251, 1; 416598973451, 1; 614391925556407, 1; ...
31 3, 1; 211, 1; 331, 2; 31883, 1; 71147, 1; 278867, 1; 21162014243, 1; ...
32 5, 1; 379, 1; 183022069, 1; ...
33 67, 1; ...
34 3, 1; 113, 1; 397, 2; 90620892826724569279, 1; ...
35 37, 1; 11497, 1; 1534957, 1; 9783807889, 1; 69375842508001, 1; ...
36 73, 1; 79, 1; 11813, 1; 4369097, 1; 22909204931549, 1; 169228131318583, 1; ...
37 3, 1; 7, 2; 23, 1; 67, 2; 179, 1; 17851, 1; ...
38 271, 1; 109398304199, 1; ...
39 41, 1; 641, 1; 551028832864379, 1; ...
40 3, 1; 23, 1; 547, 2; 163193, 1; 103821313, 1; ...
41 43, 1; 83, 1; 249341, 1; 292541, 1; ...
42 7673, 1; ...
43 3, 1; 5, 1; 31, 1; 47, 1; 631, 2; 390050231523401657587, 1; ...
44 71, 1; 89, 1; 1677045346368068427777407257, 1; ...
45 47, 1; 9051947, 1; 491229703, 1; 14128833035515081731187, 1; ...
46 3, 1; 7, 3; 103, 2; 1439, 1; 6113, 1; 1417134681070517, 1; 1211221369572027119, 1; ...
47 7, 1; 208498453, 1; 62768966677, 1; ...
48 43, 1; 97, 1; 1031295286537, 1; ...
49 3, 1; 19, 2; 43, 2; 8737, 1; 153409, 1; 4161881, 1; 47557729, 1; 1231572632971, 1; 8245918385603, 1; ...
50 130657613, 1; 96872622227, 1; 218906875337, 1; 28924899119393713, 1; ...
51 53, 1; 2629867675577, 1; 569443060925776280407727439893, 1; ...
52 3, 1; 5, 1; 23, 1; 919, 2; ...
53 107, 1; 1999, 1; 737641, 1; 930412261043, 1; 94130302460041, 1; 3192743294266334761, 1; ...
54 96017, 1; 260419, 1; 8196173, 1; ...
55 3, 1; 13, 2; 79, 2; 643, 1; 971, 1; 7283, 1; 7457, 1; 76423, 1; 12409492464988175419299322668289387, 1; ...
56 113, 1; 241, 1; 353, 1; 953, 1; 469411, 1; 361014333137116391917, 1; 3628516781012996187539369, 1; ...
57 59, 1; 100386889, 1; 6321543675100916653, 1; 341493650649284541536291, 1; ...
58 3, 1; 7, 2; 163, 2; ...
59 53, 1; 61, 1; 349, 1; 9245158309, 1; 120544548211630908902423, 1; 78160655788202033012841388117831, 1; ...
60 11, 1; 43, 1; 47, 1; 199, 1; 1171, 1; 760719804161, 1; 1964835377513, 1; 1270629830046613, 1; ...
61 3, 1; 13, 2; 97, 2; 181, 1; 5947512368423, 1; 593123889824783, 1; 118344653044407041528153987, 1; ...
62 13355959, 1; 164496841, 1; 10292091945465437025733, 1; 74699533901627257475568411525705151, 1; ...
63 5, 2; 41, 1; 853, 1; 90174101, 1; 25627454210533711458438181, 1; 5919051349510792163062037098889, 1; ...
64 3, 1; 11, 1; 19, 2; 29, 1; 73, 2; 63223491756531089, 1; ...
65 67, 1; 131, 1; 75931, 1; 161406967771214788618904596583487255863, 1; ...
66 7, 1; 13229, 1; 175621, 1; 60705385096830613, 1; 7569220110914916487333209391763, 1; ...
67 3, 1; 7, 5; 31, 2; 166807, 1; 125620306831, 1; 12633727094391709793406313, 1; 1297214578001286301761672282562841, 1; ...
68 137, 1; 4457, 1; ...
69 71, 1; 97, 1; 139, 1; 32051, 1; ...
70 3, 1; 1657, 2; 23239339, 1; 3197199066319589985643186083781752161, 1; ...
71 73, 1; 8009, 1; 11753737, 1; 81534544530059862557486363, 1; 168289332683800594045130452406607652729, 1; ...
72 5, 3; 29987982540984257, 1; 1181241087252894645457638859178483, 1; ...
73 3, 1; 11, 2; 67, 1; 1801, 2; 65677, 1; 342044282612209, 1; ...
74?
75 311, 1; 557, 1; 1871, 1; 4357651, 1; 48515144351, 1; ...
76?
77 79, 1; 131, 1; 4703212510091, 1; 761643630403488538123655879, 1; ...
78?

Friday, April 06, 2007

mmmmm iiiii

How much info can you pack per inch for variable width font? The problem becomes even more interesting if one takes into account kerning.

Thursday, April 05, 2007

Polytope groups

All finite groups have been enumerated, as have all regular polytopes. What, in detail, is the mapping between them? Roughly, icosahedron = alternating group A5*Z2. What is the 24-cell? (Update: Answer)

Wednesday, April 04, 2007

Eternal Flame

Could one connect a bright red LED to some D-cells and have it last decades? A fuel cell?

Profile-directed parallel make

"make -j" should take a profile from a previous run estimating how long a job is expected to take to avoid bottlenecks and underutilization at the end.

15 char alphabet

The consonants can be paired phonetically: a bp cx dt e fv gkq hw i jy lr mn o sz u.

Monday, April 02, 2007

Base Delta Zero movie

Base Delta Zero would make an great movie. It would have awesome special effects full of mayhem and destruction -- for example, melting a city down to a pool of slag -- but could also have vignettes told from a touching human-interest perspective -- people trying to escape or survive but who all tragically fail. Set it inside the frame of a propaganda film -- the Empire meticulously filmed the operation so as to discourage other planets from misbehaving and incur the wrath depicted. It would be nice if some well-known Imperial character commanded the operation, but the movie could be produced entirely outside the franchise if rights cannot be secured.