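I have been learning Haskell for two years now, and I am still unsure about the best (fastest) way to read a large number of numbers from a single input line. To learn, I signed up at hackerearth.com and try to solve every challenge in Haskell. Now I am stuck on a challenge because I keep hitting timeouts: my program is too slow to be accepted by the site.
Using the profiler, I found that parsing the lines containing many integers takes more than 80% of the time. The percentage gets even higher as the number of values in a line grows.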
This is how I currently read the numbers from the input line:
[Haskell snippet missing from the source]
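Is there any way to get the data into the variable faster?
BTW: the biggest test case contains a line with 200,000 nine-digit values. Parsing takes far too long (> 60 s).
Answer 0 (score: 4)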
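It is always difficult to declare one particular approach "the fastest", since there is almost always some way to squeeze out more performance. However, an approach using Data.ByteString.Char8, along the general lines you suggested, should be among the fastest ways to read numbers. If you are seeing poor performance, the problem may lie elsewhere.
To give some concrete results, I generated a 191 MB file of 20 million 9-digit numbers, space-separated on a single line. I then tried several general methods of reading a line of numbers and printing their sum (which, for the record, was 10999281565534666). The obvious approach using String: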
reader :: IO [Int]
reader = map read . words <$> getLine
sum' xs = sum xs -- work around GHC ticket 10992
main = print =<< sum' <$> reader
took 52 seconds; a similar approach using Text:
import qualified Data.Text as T
import qualified Data.Text.IO as T
import qualified Data.Text.Read as T
readText :: IO [Int]
readText = map parse . T.words <$> T.getLine
  where parse s = let Right (n, _) = T.decimal s in n
ran in 2.4 seconds (but note that it would need to be modified to handle negative numbers!); and the same approach using Char8:
import qualified Data.ByteString.Char8 as C
readChar8 :: IO [Int]
readChar8 = map parse . C.words <$> C.getLine
  where parse s = let Just (n, _) = C.readInt s in n
ran in 1.4 seconds. All examples were compiled with -O2 on GHC 8.0.2.
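The Text reader shown earlier uses T.decimal, which rejects a leading minus sign. Below is a minimal sketch of one way to accept signed input, assuming the same whitespace-separated format. The name parseSigned is hypothetical, and the function works on an already-read Text value so that it stays pure and reports malformed tokens instead of failing on a partial pattern match:

```haskell
import qualified Data.Text as T
import qualified Data.Text.Read as T

-- | Parse whitespace-separated, optionally signed integers.
-- Returns Left with a message on the first malformed token.
-- (parseSigned is a hypothetical helper, not part of the original answer.)
parseSigned :: T.Text -> Either String [Int]
parseSigned = traverse parse . T.words
  where
    parse s = case T.signed T.decimal s of
      Right (n, rest)
        | T.null rest -> Right n
        | otherwise   -> Left ("trailing input in " ++ T.unpack s)
      Left err        -> Left err
```

With Data.Text.IO also imported as T, `parseSigned <$> T.getLine` would slot it into the same place as the reader above. No timing was measured for this variant.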
As a baseline for comparison, a scanf-based C implementation:
/* GCC 5.4.0 w/ -O3 */
#include <stdio.h>
int main()
{
    long x, acc = 0;
    while (scanf(" %ld", &x) == 1) {
        acc += x;
    }
    printf("%ld\n", acc);
    return 0;
}
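took about 2.5 seconds, on par with the Text implementation.
You can squeeze a bit more performance out of the Char8 implementation. Using a hand-rolled parser: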
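import Data.List (unfoldr)
import qualified Data.Char8 as C  -- provides C.isSpace, as in the full Sum.hs listing

readChar8' :: IO [Int]
readChar8' = parse <$> C.getLine
  where parse = unfoldr go
        go s = do (n, s1) <- C.readInt s
                  let s2 = C.dropWhile C.isSpace s1
                  return (n, s2)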
ran in about 0.9 seconds. I haven't tried to determine why there is a difference, but the compiler must be missing an opportunity to optimize the words-to-readInt pipeline.
Make some numbers with Numbers.hs:
-- |Generate 20M 9-digit numbers:
--   ./Numbers 20000000 100000000 999999999 > data1.txt
import qualified Data.ByteString.Char8 as C
import Control.Monad
import System.Environment
import System.Random
main :: IO ()
main = do [n, a, b] <- map read <$> getArgs
          nums <- replicateM n (randomRIO (a,b))
          let _ = nums :: [Int]
          C.putStrLn (C.unwords (map (C.pack . show) nums))
Find their sum with Sum.hs, where the command-line argument selects the reader (string, text, char8, or hand):
import Data.List
import qualified Data.Text as T
import qualified Data.Text.IO as T
import qualified Data.Text.Read as T
import qualified Data.Char8 as C
import qualified Data.ByteString.Char8 as C
import System.Environment
-- work around https://ghc.haskell.org/trac/ghc/ticket/10992
sum' xs = sum xs
readString :: IO [Int]
readString = map read . words <$> getLine
readText :: IO [Int]
readText = map parse . T.words <$> T.getLine
  where parse s = let Right (n, _) = T.decimal s in n
readChar8 :: IO [Int]
readChar8 = map parse . C.words <$> C.getLine
  where parse s = let Just (n, _) = C.readInt s in n
readHand :: IO [Int]
readHand = parse <$> C.getLine
  where parse = unfoldr go
        go s = do (n, s1) <- C.readInt s
                  let s2 = C.dropWhile C.isSpace s1
                  return (n, s2)
main = do [method] <- getArgs
          let reader = case method of
                "string" -> readString
                "text" -> readText
                "char8" -> readChar8
                "hand" -> readHand
          print =<< sum' <$> reader
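If only an aggregate such as the sum is needed, the intermediate list can be skipped altogether. The following is a sketch of that idea, not one of the benchmarked variants above: sumInts is a hypothetical name, it uses isSpace from base's Data.Char rather than Data.Char8, and it silently stops at the first token that C.readInt rejects. No timing was measured for it.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString.Char8 as C
import Data.Char (isSpace)

-- | Sum every whitespace-separated integer in the input with a strict
-- accumulator, stopping at the first token C.readInt cannot parse.
-- (sumInts is a hypothetical helper, not part of the original answer.)
sumInts :: C.ByteString -> Int
sumInts = go 0
  where
    go !acc s =
      case C.readInt (C.dropWhile isSpace s) of
        Just (n, rest) -> go (acc + n) rest
        Nothing        -> acc
```

It could be driven by something like `print . sumInts =<< C.getLine`. Whether it beats the unfoldr version would have to be measured on the same data file.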