From the Glasgow Haskell Compiler 6.6 implementation of the module System.Random
instance Read StdGen where readsPrec _p = \ r -> case try_read r of r@[_] -> r _ -> [stdFromString r] -- because it shouldn't ever fail. where try_read r = do
One can inject invalid state here (out of range) or 0 0.
(s1, r1) <- readDec (dropWhile isSpace r) (s2, r2) <- readDec (dropWhile isSpace r1) return (StdGen s1 s2, r2) {- If we cannot unravel the StdGen from a string, create one based on the string given. -}
I wish this would use a real hash function, for example, SHA-256. It consumes only 6 characters! Since it has about 62 bits of state, it ought to in principle consume 8, though problems with overflow and the 3 multiplier might cause complications.
stdFromString :: String -> (StdGen, String) stdFromString s = (mkStdGen num, rest) where (cs, rest) = splitAt 6 s num = foldl (\a x -> x + 3 * a) 1 (map ord cs) {- | The function 'mkStdGen' provides an alternative way of producing an initial generator, by mapping an 'Int' into a generator. Again, distinct arguments should be likely to produce distinct generators. Programmers may, of course, supply their own instances of 'RandomGen'. -} mkStdGen :: Int -> StdGen -- why not Integer ?
Yes, why not? Then the user could specify lots of initial entropy.
mkStdGen s | s < 0 = mkStdGen (-s) | otherwise = StdGen (s1+1) (s2+1) where (q, s1) = s `divMod` 2147483562 s2 = q `mod` 2147483398
It should be commented that we use one less than the moduli to avoid the initial state being zero.
Look at all these hard coded numbers! Ideally they should be specified exactly once, in factored form, as in L'Ecuyer's paper. The factorization (showing no common factors between the moduli) demonstrates the long period.
Sadly, the function below is not exported.
createStdGen :: Integer -> StdGen createStdGen s | s < 0 = createStdGen (-s) | otherwise = StdGen (fromInteger (s1+1)) (fromInteger (s2+1)) where (q, s1) = s `divMod` 2147483562 s2 = q `mod` 2147483398 -- FIXME: 1/2/3 below should be ** (vs@30082002) XXX
FIXME huh?
...
The lower 8 bits are used to create a random Char. (See the definition, especially the `mod` call in randomIvalInteger below.) The lower bits of RNG are suspect.
Even worse the lower 1 bits are used to create a random Bool.
instance Random Char where randomR (a,b) g = case (randomIvalInteger (toInteger (ord a), toInteger (ord b)) g) of (x,g) -> (chr x, g) random g = randomR (minBound,maxBound) g instance Random Bool where randomR (a,b) g = case (randomIvalInteger (toInteger (bool2Int a), toInteger (bool2Int b)) g) of (x, g) -> (int2Bool x, g) where bool2Int False = 0 bool2Int True = 1 int2Bool 0 = False int2Bool _ = True random g = randomR (minBound,maxBound) g -- hah, so you thought you were saving cycles by using Float?
Yes, it sure would be nice. It would also be nice to avoid invoking GMP (via type Integer) when creating random numbers of the standard small types.
instance Random Float where random g = randomIvalDouble (0::Double,1) realToFrac g randomR (a,b) g = randomIvalDouble (realToFrac a, realToFrac b) realToFrac g mkStdRNG :: Integer -> IO StdGen mkStdRNG o = do ct <- getCPUTime (TOD sec _) <- getClockTime return (createStdGen (sec * 12345 + ct + o))
Is 12345 the optimal constant? Does it interact badly with the modulus 2147483562?
randomIvalInteger :: (RandomGen g, Num a) => (Integer, Integer) -> g -> (a, g) randomIvalInteger (l,h) rng | l > h = randomIvalInteger (h,l) rng | otherwise = case (f n 1 rng) of (v, rng') -> (fromInteger (l + v `mod` k), rng')
So iLogBase overshoots, then `mod` is used to bring it back into range. The wraparound overshoot makes it no longer uniform.
where k = h - l + 1 b = 2147483561
Shouldn't this be 2147483562? We are creating a big number in base 2147483562.
n = iLogBase b k f 0 acc g = (acc, g) f n acc g = let (x,g') = next g
Might need to subtract 1 from x.
in f (n-1) (fromIntegral x + acc * b) g' randomIvalDouble :: (RandomGen g, Fractional a) => (Double, Double) -> (Double -> a) -> g -> (a, g) randomIvalDouble (l,h) fromDouble rng | l > h = randomIvalDouble (h,l) fromDouble rng | otherwise =
Instead of choosing the pre-scaled range to be the same as the modulus (about 2 billion), we instead do from -2G to 2G, or 4 billion. This will require two calls to the RNG, then throwing away almost all the bits of the second call (which ought to be used for double-precision mantissa!)
case (randomIvalInteger (toInteger (minBound::Int), toInteger (maxBound::Int)) rng) of (x, rng') ->
Consumes only 32-ish bits of randomness to produce 52 bits of double-precision mantissa!
let scaled_x = fromDouble ((l+h)/2) +
The divide by 2 factor is because intRange goes from -2G to 2G.
fromDouble ((h-l) / realToFrac intRange) * fromIntegral (x::Int) in (scaled_x, rng')
intRange returns approximately 4 * 109.
intRange :: Integer intRange = toInteger (maxBound::Int) - toInteger (minBound::Int)
This function iLogBase has quadratic running time.
iLogBase :: Integer -> Integer -> Integer iLogBase b i = if i < b then 1 else 1 + iLogBase b (i `div` b) stdRange :: (Int,Int)
The range is actually from 1.
stdRange = (0, 2147483562) stdNext :: StdGen -> (Int, StdGen) -- Returns values in the range stdRange stdNext (StdGen s1 s2) = (z', StdGen s1'' s2'') where z' = if z < 1 then z + 2147483562 else z
The value z' = 0 cannot be produced! The statistical independence between s1'' and s2'' is violated, so I'm not sure about the uniformity.
z = s1'' - s2'' k = s1 `quot` 53668 s1' = 40014 * (s1 - k * 53668) - k * 12211 s1'' = if s1' < 0 then s1' + 2147483563 else s1' k' = s2 `quot` 52774 s2' = 40692 * (s2 - k' * 52774) - k' * 3791 s2'' = if s2' < 0 then s2' + 2147483399 else s2' stdSplit :: StdGen -> (StdGen, StdGen) stdSplit std@(StdGen s1 s2) = (left, right) where -- no statistical foundation for this! left = StdGen new_s1 t2 right = StdGen t1 new_s2 new_s1 | s1 == 2147483562 = 1 | otherwise = s1 + 1 new_s2 | s2 == 1 = 2147483398 | otherwise = s2 - 1 StdGen t1 t2 = snd (next std)
No comments :
Post a Comment