Question

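It is common knowledge that one does not use [Char] to read large amounts of data in Haskell; one uses ByteString for that job. The usual explanation is that Chars are large and that lists add their own overhead.

However, this does not seem to cause any problems with output.

For example, the following program: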

main = interact $ const $ unwords $ map show $ replicate 500000 38000000

runs in just 131 ms on my computer, while the following one:

import Data.List

sum' :: [Int] -> Int
sum' = foldl' (+) 0

main = interact $ show . sum' . map read . words

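takes 3.38 seconds when fed the output of the first program as input!

What is the reason for such a disparity between input and output performance when using Strings?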

Answer 1

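I don't think this issue necessarily has to do with I/O. Rather, it shows that the Read instance for Int is quite inefficient.

First, consider the following program, which only processes a lazy list and does no input at all. It takes 4.1 seconds on my machine (compiled with -O2):

main = print $ sum' $ map read $ words $ unwords $ map show $ replicate 500000 38000000

Replacing the read function with length reduces the time to 0.48 seconds:

main = print $ sum' $ map length $ words $ unwords $ map show $ replicate 500000 38000000

Furthermore, replacing the read function with a hand-written version results in a time of 0.52 seconds:

main = print $ sum' $ map myread $ words $ unwords $ map show $ replicate 500000 38000000

-- Accumulate decimal digits left to right; assumes the input contains only '0'..'9'.
myread :: String -> Int
myread = loop 0
  where
    loop n [] = n
    loop n (d:ds) = let d' = fromEnum d - fromEnum '0' :: Int
                        n' = 10 * n + d'
                    in loop n' ds

My guess as to why read is so inefficient is that its implementation uses the Text.ParserCombinators.ReadP module, which may not be the fastest choice for the simple case of reading a single integer.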
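Since the question mentions ByteString as the usual tool for bulk input, here is a minimal sketch of the same summing program written against Data.ByteString.Char8, which avoids both the [Char] representation and the Read instance. It is not part of the original answer and was not timed here; the name sumInts is made up for illustration, and the sketch assumes the bytestring package is available (it ships with GHC).

```haskell
import qualified Data.ByteString.Char8 as B
import Data.List (foldl')
import Data.Maybe (mapMaybe)

-- Hypothetical helper (not from the original answer): sum all
-- whitespace-separated integers in a strict ByteString.
-- B.readInt parses a leading, optionally signed decimal integer and
-- returns it together with the unparsed remainder. Tokens that do not
-- start with an integer are skipped, and any trailing characters after
-- the digits of a token are ignored.
sumInts :: B.ByteString -> Int
sumInts = foldl' (+) 0 . map fst . mapMaybe B.readInt . B.words

main :: IO ()
main = B.getContents >>= print . sumInts
```

It can be fed the output of the first program from the question in the same way as the String version. Whether it actually beats the hand-written myread on a given machine is something to measure rather than assume.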
