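I wanted to fool around with random numbers to see whether the random generators in Haskell are uniformly distributed, so after a few attempts (generating lists led to a stack overflow) I wrote the following program.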
module Main where

import System.Environment (getArgs)
import Control.Applicative ((<$>))
import System.Random (randomRIO)

main :: IO ()
main = do nn <- map read <$> getArgs :: IO [Int]
          let n = if null nn then 10000 else head nn
          m <- loop n 0 (randomRIO (0,1))
          putStrLn $ "True: "++show (m//n::Double) ++", False "++show ((n-m)//n :: Double)
          return ()

loop :: Int -> Int -> IO Double -> IO Int
loop n acc x | n<0       = return acc
             | otherwise = do x' <- round <$> x
                              let x'' = (x' + acc) in x'' `seq` loop (n-1) x'' x

(//) :: (Integral a, Fractional b) => a -> a -> b
x // y = fromIntegral x / fromIntegral y
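Once I had it working OK-ish, I decided to write another version in Java (which I am not very good at), expecting Haskell to beat it. Instead, the Java program ran in about half the time of the Haskell version.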
import java.util.Random;

public class MonteCarlo {
    public static void main(String[] args) {
        int n = Integer.parseInt(args[0]);
        Random r = new Random();

        int acc = 0;
        for (int i=0; i<=n; i++) {
            acc += Math.round(r.nextDouble());
        }
        System.out.println("True: "+(double) acc/n+", False: "+(double)(n-acc)/n);
    }
}
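I tried looking at the profile of the Haskell version. It told me that most of the work is done in the loop, no wonder! I tried looking at the Core, but I really don't understand it. I suspect the Java version may be using more than one core, because the reported CPU usage was above 100% when I timed it.
I guess the code could be improved by using unboxed Doubles/Ints, but my knowledge of Haskell does not go that far.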
Answer 0 (score: 10)
I tried a cruder version of your code that relies on laziness:
module Main where

import System.Environment
import Control.Applicative
import System.Random

main :: IO ()
main = do
  args <- getArgs
  let n = if null args then 10000 else read $ head args
  g <- getStdGen
  let vals = randomRs (0, 1) g :: [Int]
  let s = sum $ take n vals
  putStrLn $ "True: " ++ f s n ++ ", False" ++ f (n - s) n

f x y = show $ ((fromIntegral x / fromIntegral y) :: Double)
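Now, ignore the fact that I left out some type declarations and imported everything from the modules. I just wanted to test freely.
Back to the matter at hand: your version is saved as original.hs, and the one above as 1.hs. Timings: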
[mihai@esgaroth so]$ ghc --make -O2 original.hs
[1 of 1] Compiling Main ( original.hs, original.o )
Linking original ...
[mihai@esgaroth so]$ ghc --make -O2 1.hs
[1 of 1] Compiling Main ( 1.hs, 1.o )
Linking 1 ...
[mihai@esgaroth so]$ time ./original
True: 0.4981, False 0.5019
real 0m0.022s
user 0m0.021s
sys 0m0.000s
[mihai@esgaroth so]$ time ./1
True: 0.4934, False0.5066
real 0m0.005s
user 0m0.003s
sys 0m0.001s
[mihai@esgaroth so]$ time ./original
True: 0.5063, False 0.4937
real 0m0.018s
user 0m0.017s
sys 0m0.001s
[mihai@esgaroth so]$ time ./1
True: 0.5024, False0.4976
real 0m0.005s
user 0m0.003s
sys 0m0.002s
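Each time, the new code is about 4 times faster. And this is still a first version that uses lazy structures and existing library code.
The next step would be to profile the heap and see whether it is worth folding the computation into the generation of the random list.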
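One possible shape for "computing while generating" is to drop the intermediate list and thread the generator through a strict loop by hand. The following is only a minimal sketch, assuming the random package; countOnes is a hypothetical name chosen for illustration, and this version has not been timed here.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import System.Random (StdGen, getStdGen, randomR)

-- Count how many of n draws from {0, 1} come out as 1.
-- The generator is passed along explicitly, so no list is built.
countOnes :: Int -> StdGen -> Int
countOnes n = go n 0
  where
    go :: Int -> Int -> StdGen -> Int
    go k !acc g
      | k <= 0    = acc                        -- done: return the strict accumulator
      | otherwise =
          let (x, g') = randomR (0, 1 :: Int) g
          in go (k - 1) (acc + x) g'           -- continue with the new generator

main :: IO ()
main = do
  g <- getStdGen
  let n = 10000 :: Int
      s = countOnes n g
  putStrLn $ "True: " ++ show (fromIntegral s / fromIntegral n :: Double)
```

The bang pattern on acc keeps the running sum evaluated at every step, which is the same effect the seq calls in the original loop are after.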
PS: On my machine:
[mihai@esgaroth so]$ time java MonteCarlo 10000
True: 0.5011, False: 0.4989
real 0m0.063s
user 0m0.066s
sys 0m0.010s
PPS: Running the code compiled without -O2:
[mihai@esgaroth so]$ time ./original
True: 0.5035, False 0.4965
real 0m0.032s
user 0m0.031s
sys 0m0.001s
[mihai@esgaroth so]$ time ./1
True: 0.4975, False0.5025
real 0m0.014s
user 0m0.010s
sys 0m0.003s
Only about a 2x speedup this time, but still faster than Java.
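As a side check on the uniformity question that started all this, the fractions printed in the runs above can be compared with what a fair 0/1 source would be expected to give. For n independent fair draws, the standard error of the observed fraction is sqrt(0.25 / n), which is about 0.005 for n = 10000. The sketch below prints how far each observed fraction lies from 0.5 in units of that standard error. The helper names stdErr and zScore are made up for this example, and the listed fractions are simply copied from the timing runs above.

```haskell
module Main where

-- Standard error of the observed fraction of ones among n fair 0/1 draws.
stdErr :: Int -> Double
stdErr n = sqrt (0.25 / fromIntegral n)

-- Distance of an observed fraction from 0.5, in standard errors.
zScore :: Int -> Double -> Double
zScore n p = abs (p - 0.5) / stdErr n

main :: IO ()
main = mapM_ report [0.4981, 0.4934, 0.5063, 0.5024, 0.5035, 0.4975]
  where
    report :: Double -> IO ()
    report p = putStrLn $
      show p ++ " is " ++ show (zScore 10000 p) ++ " standard errors from 0.5"
```

Distances of a few standard errors or less are what one would expect from a uniform generator; a handful of runs like this is of course not a rigorous test.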
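Answer 1 (score: 3)
This is too big to leave as a comment, but it isn't really an answer, just some data. I took the main loop and tested it with criterion. Then I tried a few variations.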
import Control.Applicative
import System.Random
import Criterion.Main
import Data.List

iters :: Int
iters = 10000

-- Recursion in IO, drawing Doubles and rounding them.
loop1 :: IO Int
loop1 = go iters 0
  where
    go n acc | n < 0 = return acc
             | otherwise = do
                 x <- randomRIO (0, 1 :: Double)
                 let acc' = acc + round x
                 acc' `seq` go (n - 1) acc'

-- Same as loop1, but drawing Ints directly.
loop1' :: IO Int
loop1' = go iters 0
  where
    go n acc | n < 0 = return acc
             | otherwise = do
                 x <- randomRIO (0, 1)
                 let acc' = acc + x
                 acc' `seq` go (n - 1) acc'

-- Lazy list from randomRs, summed with a strict fold (Double).
loop2 :: IO Int
loop2 = do
    g <- newStdGen
    let s = foldl' (+) 0 . take iters . map round $ randomRs (0, 1 :: Double) g
    s `seq` return s

-- Same as loop2, but with Ints.
loop2' :: IO Int
loop2' = do
    g <- newStdGen
    let s = foldl' (+) 0 . take iters $ randomRs (0, 1) g
    s `seq` return s

-- Pure recursion threading the generator by hand (Double).
loop3 :: IO Int
loop3 = do
    g0 <- newStdGen
    let go n acc g | n < 0 = acc
                   | otherwise = let (x, g') = randomR (0, 1 :: Double) g
                                     acc' = acc + round x
                                 -- pass acc' onward (the posted code passed acc, so the result stayed 0)
                                 in acc' `seq` go (n - 1) acc' g'
    return $! go iters 0 g0

-- Same as loop3, but with Ints.
loop3':: IO Int
loop3'= do
    g0 <- newStdGen
    let go n acc g | n < 0 = acc
                   | otherwise = let (x, g') = randomR (0, 1) g
                                     acc' = acc + x
                                 -- pass acc' onward (the posted code passed acc, so the result stayed 0)
                                 in acc' `seq` go (n - 1) acc' g'
    return $! go iters 0 g0

main :: IO ()
main = defaultMain $
    [ bench "loop1" $ whnfIO loop1
    , bench "loop2" $ whnfIO loop2
    , bench "loop3" $ whnfIO loop3
    , bench "loop1'" $ whnfIO loop1'
    , bench "loop2'" $ whnfIO loop2'
    , bench "loop3'" $ whnfIO loop3'
    ]
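And here are the timings. One caveat: I'm running in a virtual machine, so performance is somewhat random. The overall trends were consistent, though.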
carl@debian:~/hask$ ghc -Wall -O2 randspeed.hs
[1 of 1] Compiling Main ( randspeed.hs, randspeed.o )
Linking randspeed ...
carl@debian:~/hask$ ./randspeed
warming up
estimating clock resolution...
mean is 3.759461 us (160001 iterations)
found 3798 outliers among 159999 samples (2.4%)
1210 (0.8%) low severe
2492 (1.6%) high severe
estimating cost of a clock call...
mean is 2.152186 us (14 iterations)
found 3 outliers among 14 samples (21.4%)
1 (7.1%) low mild
1 (7.1%) high mild
1 (7.1%) high severe
benchmarking loop1
mean: 15.88793 ms, lb 15.41649 ms, ub 16.37845 ms, ci 0.950
std dev: 2.472512 ms, lb 2.332036 ms, ub 2.650680 ms, ci 0.950
variance introduced by outliers: 90.466%
variance is severely inflated by outliers
benchmarking loop2
mean: 26.44217 ms, lb 26.28822 ms, ub 26.64457 ms, ci 0.950
std dev: 905.7558 us, lb 713.3236 us, ub 1.165090 ms, ci 0.950
found 8 outliers among 100 samples (8.0%)
6 (6.0%) high mild
2 (2.0%) high severe
variance introduced by outliers: 30.636%
variance is moderately inflated by outliers
benchmarking loop3
mean: 18.43004 ms, lb 18.29330 ms, ub 18.60769 ms, ci 0.950
std dev: 794.3779 us, lb 628.6630 us, ub 1.043238 ms, ci 0.950
found 5 outliers among 100 samples (5.0%)
4 (4.0%) high mild
1 (1.0%) high severe
variance introduced by outliers: 40.516%
variance is moderately inflated by outliers
benchmarking loop1'
mean: 4.579197 ms, lb 4.494131 ms, ub 4.677335 ms, ci 0.950
std dev: 468.0648 us, lb 406.8328 us, ub 558.5602 us, ci 0.950
found 2 outliers among 100 samples (2.0%)
2 (2.0%) high mild
variance introduced by outliers: 80.019%
variance is severely inflated by outliers
benchmarking loop2'
mean: 4.473382 ms, lb 4.386545 ms, ub 4.567254 ms, ci 0.950
std dev: 460.5377 us, lb 410.1520 us, ub 543.1835 us, ci 0.950
found 1 outliers among 100 samples (1.0%)
variance introduced by outliers: 80.033%
variance is severely inflated by outliers
benchmarking loop3'
mean: 3.577855 ms, lb 3.490043 ms, ub 3.697916 ms, ci 0.950
std dev: 522.4125 us, lb 416.7015 us, ub 755.3713 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
4 (4.0%) high mild
1 (1.0%) high severe
variance introduced by outliers: 89.406%
variance is severely inflated by outliers
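Here are my conclusions:
1. randomRs is slow. It's too bad the list doesn't fuse with foldl', which would make this faster and simpler.
2. There is very little overhead associated with the recursion in IO. It shows up in the version using Int instead of Double, but that's it.
3. The Random instance for Double is super slow. Or round is super slow. Or maybe it's the combination of the two.
Answer 2 (score: 0)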
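As the comments suggested, random number generation looked like the culprit. I did some experiments and found that this is indeed the case.
@Mihai_Maruseac's example compares a different solution, one that generates Int instead of Double values, which happens to be a bit faster. But on my computer (Debian 7, 6 GiB RAM, Core i5) the Java version is still faster.
Here is my Haskell file comparing the different solutions. As a side note, the first four loops are commented out because they cause a stack overflow for the number of samples generated.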
module Main where

import Control.Applicative ((<$>))
import Control.Monad (replicateM)
import Data.List (foldl')
import Control.Monad.Primitive (PrimMonad, PrimState)
import Data.Vector.Generic (Vector)
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Generic as G
import System.Random.MWC
import System.Random (randomRIO
                     ,getStdGen
                     ,randomRs
                     ,randomR
                     ,RandomGen)
import Criterion.Main

main :: IO ()
main = do let n = 1000000 ::Int -- 10^6
          defaultMain [
            -- bench "loop1" $ whnfIO (loop1 n),
            -- bench "loop2" $ whnfIO (loop2 n),
            -- bench "loop3" $ whnfIO (loop3 n),
            -- bench "loop4" $ whnfIO (loop4 n),
            bench "loop5" $ whnfIO (loop5 n),
            bench "loop6" $ whnfIO (loop6 n),
            bench "loop7" $ whnfIO (loop7 n),
            bench "moop5" $ whnfIO (moop5 n),
            bench "moop6" $ whnfIO (moop6 n)]

loop1 :: Int -> IO Int
loop1 n = do rs <- replicateM n (randomRIO (0,1)) :: IO [Double]
             return . length . filter (==True) $ map (<=0.5) rs

loop2 :: Int -> IO Int
loop2 n = do rs <- replicateM n (randomRIO (0,1)) :: IO [Double]
             return $ foldl' (\x y -> x + round y) 0 rs

loop3 :: Int -> IO Int
loop3 n = loop n 0 (randomRIO (0,1))
  where loop :: Int -> Int -> IO Double -> IO Int
        -- add the final draw to acc (the posted code returned only the last draw)
        loop n' acc x | n' <=0    = (acc +) . round <$> x
                      | otherwise = do x' <- x
                                       loop (n'-1) (round x' + acc) x

loop4 :: Int -> IO Int
loop4 n = loop n 0 (randomRIO (0,1))
  where loop :: Int -> Int -> IO Double -> IO Int
        loop n' acc x | n'<0      = return acc
                      | otherwise = do x' <- round <$> x
                                       let x'' = (x' + acc) in x'' `seq` loop (n'-1) x'' x

loop5 :: Int -> IO Int
loop5 n = do g <- getStdGen
             return . sum . take n $ randomRs (0,1::Int) g

loop6 :: Int -> IO Int
loop6 n = do g <- getStdGen
             return $ loop 0 n g
  where loop :: (RandomGen t) => Int -> Int -> t -> Int
        loop acc n' g | n'<0      = acc
                      | otherwise = let (m,g') = randomR (0,1::Int) g
                                        acc' = acc + m
                                    in acc' `seq` loop acc' (n'-1) g'

loop7 :: Int -> IO Int
loop7 n = do g <- getStdGen
             return $ loop 0 n g
  where loop :: (RandomGen t) => Int -> Int -> t -> Int
        loop acc n' g | n'<0      = acc
                      | otherwise = let (m,g') = randomR (0,1::Double) g
                                        acc' = acc + round m
                                    in acc' `seq` loop acc' (n'-1) g'

moop5 :: Int -> IO Int
moop5 n = do vs <- withSystemRandom . asGenST $ \gen -> uniformVectorR (0,1) gen n
             return . V.sum $ V.map round (vs :: V.Vector Double)

moop6 :: Int -> IO Int
moop6 n = do vs <- withSystemRandom . asGenST $ \gen -> uniformVectorR (0,1) gen n
             return $ V.sum (vs :: V.Vector Int)

-- Helper functions ------------------------------------------------------------

report :: Int -> Int -> String -> IO ()
report n m msg = putStrLn $ msg ++ "\n" ++
                 "True: "  ++ show (    m//n :: Double) ++ ", "++
                 "False: " ++ show ((n-m)//n :: Double)

(//) :: (Integral a, Fractional b) => a -> a -> b
x // y = fromIntegral x / fromIntegral y

uniformVectorR :: (PrimMonad m, Variate a, Vector v a) =>
                  (a, a) -> Gen (PrimState m) -> Int -> m (v a)
uniformVectorR (lo,hi) gen n = G.replicateM n (uniformR (lo,hi) gen)
{-# INLINE uniformVectorR #-}
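
The two moop variants first fill a vector and then sum it. A possible alternative, sketched here under the assumption that the mwc-random package is installed, is to accumulate while drawing so that no intermediate vector is needed. countOnes is a hypothetical name; createSystemRandom, uniformR and GenIO come from System.Random.MWC. This variant was not part of the benchmark above, so no timing is claimed for it.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import System.Random.MWC (GenIO, createSystemRandom, uniformR)

-- Draw n values from {0, 1} and count the ones, keeping the sum strict.
countOnes :: Int -> GenIO -> IO Int
countOnes n gen = go n 0
  where
    go :: Int -> Int -> IO Int
    go k !acc
      | k <= 0    = return acc
      | otherwise = do
          x <- uniformR (0, 1 :: Int) gen   -- bounds are inclusive for Int
          go (k - 1) (acc + x)

main :: IO ()
main = do
  gen <- createSystemRandom                 -- seed from the system's entropy source
  let n = 1000000 :: Int
  m <- countOnes n gen
  putStrLn $ "True: "  ++ show (fromIntegral m       / fromIntegral n :: Double)
          ++ ", False: " ++ show (fromIntegral (n - m) / fromIntegral n :: Double)
```

To compare it with the others, it could be added to the criterion list in the same way as moop6, with the generator created once outside the benchmarked action.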