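To learn Haskell, I have been solving programming challenges. While trying to solve a Hackerrank problem about filtering the elements of a list with thousands of elements, I keep failing the time limit tests.
Problem statement: from a list of elements, filter those that appear more than k times, and print them in order of appearance.
The best I have so far is this code: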
import qualified Data.ByteString.Char8 as BSC
import Data.Maybe (fromJust)
import Data.List (intercalate, elemIndices, foldl')
import qualified Data.Set as S
-- Improved version of nub
nub' :: (Ord a) => [a] -> [a]
nub' = go S.empty
  where go _ [] = []
        go s (x:xs) | S.member x s = go s xs
                    | otherwise = x : go (S.insert x s) xs
-- Extract Int from ByteString
getIntFromBS :: BSC.ByteString -> Int
getIntFromBS = fst . fromJust . BSC.readInt
{-
Parse read file:
a k1
n1 n2 n3 n4 ...
c k2
m1 m2 m3 m4 ...
into appropriate format:
[(k1, [n1,n2,n3,n4]), (k2, [m1,m2,m3,m4])]
-}
createGroups :: [BSC.ByteString] -> [(Int, [Int])]
createGroups [] = []
createGroups (p:v:xs) =
  let val = getIntFromBS $ last $ BSC.split ' ' p
      grp = foldr (\x acc -> getIntFromBS x : acc) [] $ BSC.split ' ' v
  in (val, grp) : createGroups xs
solve :: (Int, [Int]) -> String
solve (k, v) = intercalate " " $ if null res then ["-1"] else res
  where
    go n acc =
      if length (elemIndices n v) > k
        then show n : acc
        else acc
    res = foldr go [] (nub' v)
fullSolve :: [BSC.ByteString] -> [String]
fullSolve xs = foldl' (\acc tupla -> acc ++ [solve tupla]) [] $ createGroups xs
main = do
  BSC.getContents >>= mapM_ putStrLn . fullSolve . drop 1 . BSC.lines
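I would like to know where this code can be improved. I have tried many variants using maps, vectors, and parsing versus not parsing the strings read from the file into Int, but the code shown is the best I have.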
Answer 0 (score: 1)
If I had to solve this, I would probably first try Data.Map.Strict (for O(log n) modifications), hidden inside the operations of a Control.Monad.State.Strict monad transformer.
import Data.Map.Strict
import Control.Monad.State.Strict
type SIO x = StateT (Map String Int) IO x
incCount :: String -> Int -> Int -> Int
incCount _ _ old_val = 1 + old_val
incAndGetCount :: String -> SIO Int
incAndGetCount s = fmap unMaybe $ state $ insertLookupWithKey incCount s 1
  where unMaybe (Just x) = x + 1
        unMaybe Nothing = 1
processKey :: String -> SIO ()
processKey s = do
  ct <- incAndGetCount s
  if ct == 5 then lift (putStrLn s) else return ()
process :: [String] -> IO ()
process list = evalStateT (mapM_ processKey list) empty
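Although I find this code more elegant, I cannot know whether it is faster without actually seeing the test data. In any case, it is equivalent to an imperative loop that puts each string into a dictionary, retrieves the number of times it has been seen so far, and prints the string to standard output if that number is 5.
Of course, you need to combine it with a suitable main function.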
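The loop described above can also be written as a pure function, which makes the threshold a parameter instead of the hard-coded 5. This is only a minimal sketch: the name keysReaching and its threshold argument are hypothetical and not part of the answer's code, and it uses Data.Map.Strict directly instead of the StateT wrapper.

```haskell
import qualified Data.Map.Strict as M

-- | Hypothetical pure counterpart of the loop: return each key at the
-- moment its running count reaches `threshold`.
keysReaching :: Ord a => Int -> [a] -> [a]
keysReaching threshold = go M.empty
  where
    go _ [] = []
    go seen (x : xs)
      | n == threshold = x : rest   -- count just hit the threshold: emit x
      | otherwise      = rest       -- below or already past it: emit nothing
      where
        n    = M.findWithDefault 0 x seen + 1  -- times x has been seen so far
        rest = go (M.insert x n seen) xs       -- continue with updated counts
```

For example, keysReaching 2 ["a","b","b","a","a"] yields ["b","a"]: each key is reported once, in the order in which it reaches the threshold.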
Answer 1 (score: 1)
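I did try Data.Map at the beginning, but that attempt lacked the optimization pointed out in the comments about using a fold rather than a map, and it also did not produce the required output order (order of appearance). The final solution is as follows: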
{-# OPTIONS_GHC -O2 #-}
import Control.Monad (liftM, replicateM_)
import Data.Maybe (fromJust)
import Data.List (foldl', sort, unwords)
import qualified Data.Map.Strict as M
import qualified Data.ByteString.Char8 as BSC
getIntFromBS :: BSC.ByteString -> Int
getIntFromBS = fst.fromJust.BSC.readInt
solve :: Int -> [Int] -> String
solve k = unwords . map snd . sort . map finalPair . filter hasHighFreq . M.toList . foldl' insMap M.empty . zip [0..]
  where
    f _ _ (i, old_value) = (i, old_value + 1)
    insMap m' (i, x) = M.insertWithKey f x (i,1) m'
    hasHighFreq (_, (_, freq)) = freq >= k
    finalPair (val, (i, freq)) = (i, show val)
main = do
  n <- liftM getIntFromBS BSC.getLine
  replicateM_ n $ do
    [_, k] <- liftM (map getIntFromBS . BSC.words) BSC.getLine
    vals <- liftM (map getIntFromBS . BSC.words) BSC.getLine
    let res = solve k vals
    putStrLn (if null res then "-1" else res)
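To make the bookkeeping in this solution easier to follow, here is a minimal sketch of the same idea split into two steps: a map from each value to the pair (index of first occurrence, count), and a final sort by that index. The names firstSeenAndCount and frequentInOrder are hypothetical, the seq on the count is an added assumption to keep the counter evaluated, and the "at least k" threshold follows the freq >= k test in the code above.

```haskell
import Data.List (foldl', sortOn)
import qualified Data.Map.Strict as M

-- | Hypothetical helper: map each value to
-- (index of its first occurrence, number of occurrences).
firstSeenAndCount :: Ord a => [a] -> M.Map a (Int, Int)
firstSeenAndCount = foldl' step M.empty . zip [0 ..]
  where
    step m (i, x) = M.insertWith bump x (i, 1) m
    -- Keep the index already stored, bump the count, and force the count
    -- so that no chain of (+1) thunks builds up inside the tuple.
    bump _new (i0, c) = let c' = c + 1 in c' `seq` (i0, c')

-- | Hypothetical helper: values occurring at least k times,
-- listed in order of first appearance.
frequentInOrder :: Ord a => Int -> [a] -> [a]
frequentInOrder k =
  map fst
    . sortOn (fst . snd)          -- order by first-occurrence index
    . filter ((>= k) . snd . snd) -- keep values with count >= k
    . M.toList
    . firstSeenAndCount
```

For the input 4 5 2 5 4 3 1 3 4 with k = 2, frequentInOrder 2 [4,5,2,5,4,3,1,3,4] gives [4,5,3]. The map is keyed by value, so the stored first index is what restores the order of appearance after M.toList has returned the entries in key order.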
Answer 2 (score: 0)
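Edit: oops, this is wrong. The order produced is "as discovered", but it should be "as first seen in the original list". This can be fixed with indexing and sorting... unfortunately, that means it is no longer productive.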
Using mapAccumL, you can make this function more efficient and avoid any need for indexing and sorting:
import qualified Data.IntMap.Strict as M
import Data.List
-- mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
solve :: Int -> [Int] -> [Int]
solve k ns = concat . snd $ mapAccumL g M.empty ns
  where
    g m n = case u of -- for each n in ns, with updating m,
        Nothing -> (M.insert n 1 m, [n | k==1])
        Just c -> (m2, [n | c==k-1])
      where
        (u,m2) = M.updateLookupWithKey (\_ c -> Just (c+1)) n m
As soon as the k-th instance of an element is encountered in the input list, we can produce it. The final frequency map is ignored.
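The two properties discussed in this answer, incremental output and the "as discovered" ordering, can be checked on small inputs. The following is a minimal sketch rather than the answer's exact code: emitAtKth, lazyDemo and orderDemo are hypothetical names, and the counter update uses findWithDefault and insert instead of updateLookupWithKey.

```haskell
import Data.List (mapAccumL)
import qualified Data.IntMap.Strict as M

-- | Hypothetical helper: emit n at the moment its running count reaches k.
emitAtKth :: Int -> [Int] -> [Int]
emitAtKth k = concat . snd . mapAccumL step M.empty
  where
    step seen n =
      let c = M.findWithDefault 0 n seen + 1   -- occurrences of n so far
      in (M.insert n c seen, [n | c == k])     -- emit only on the k-th one

-- Output is produced incrementally, so a prefix of the result can be
-- taken even from an infinite input.
lazyDemo :: [Int]
lazyDemo = take 3 (emitAtKth 2 (cycle [1, 2, 3]))

-- The order is that of each value's k-th occurrence,
-- not that of its first appearance.
orderDemo :: [Int]
orderDemo = emitAtKth 2 [1, 2, 2, 1]
```

Here lazyDemo evaluates to [1,2,3], while orderDemo evaluates to [2,1] even though order of first appearance would be [1,2]. That difference is the issue described in the edit note at the top of this answer.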
It is better to keep your functions focused. Once you have produced a list of Ints, you can pass it to unwords . map show or whatever you have.
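As a minimal sketch of that separation, formatting can live in its own small function. The name render is hypothetical, and the "-1" fallback for an empty result is taken from the code in the question and in the previous answer.

```haskell
-- | Hypothetical helper: format a result list for output, falling back
-- to "-1" when no element qualifies.
render :: [Int] -> String
render [] = "-1"
render ns = unwords (map show ns)
```

With this in place, a main function only has to parse the input, call the solver, and print render applied to the result.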
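The Data.IntMap.Strict documentation says that "... benchmarks show that [IntMap] is (much) faster on insertions and deletions when compared to a generic size-balanced map implementation."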