Haskell: split words at the first space

Time: 2011-08-16 03:24:33

Tags: string haskell

Note that this is different from using the words function.

I want to convert this:

"The quick brown fox jumped over the lazy dogs."

into this:

["The"," quick"," brown"," fox"," jumped"," over"," the"," lazy"," dogs."]

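Note that each break falls at the first space after a word, and the spaces are kept at the start of the next element.

The best I could come up with is: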

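import Data.Char (isSpace)

parts :: String -> [String]
parts "" = []
parts s  = if null a then (c ++ e):parts f else a:parts b
    where
    (a, b) = break isSpace s   -- a: leading word, if s starts with one
    (c, d) = span isSpace s    -- c: leading run of spaces
    (e, f) = break isSpace d   -- e: the word following those spaces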

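It looks a bit inelegant. Can anyone think of a better way to express this?

7 answers:

Answer 0 (score: 6)

Edit: sorry, I hadn't read the question properly. Hopefully this new answer meets your requirements.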

> List.groupBy (\x y -> y /= ' ') "The quick brown fox jumped over the lazy dogs."
["The"," quick"," brown"," fox"," jumped"," over"," the"," lazy"," dogs."]

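The library function groupBy takes a predicate that tells it whether to add the next element y to the current list, which starts with x, or to start a new list.

In this case we don't care how the current list starts; we just want to start a new list (that is, have the predicate evaluate to False) whenever the next element y is a space.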
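A minimal sketch of the same idea as a standalone function. The name splitKeepSpace is hypothetical; only groupBy from Data.List in base is assumed:

```haskell
import Data.List (groupBy)

-- Hypothetical helper name, not part of the original answer.
-- Data.List.groupBy passes the first element of the current group as x,
-- so the predicate ignores x and only asks whether y is a space.
splitKeepSpace :: String -> [String]
splitKeepSpace = groupBy (\_ y -> y /= ' ')
```

For example, splitKeepSpace "The quick brown fox" evaluates to ["The"," quick"," brown"," fox"].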

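Edit

n.m. pointed out that multiple consecutive spaces are not handled correctly. In that case you can switch to Data.List.HT, which has the semantics you want.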

> import Data.List.HT as HT
> HT.groupBy (\x y -> y /= ' ' || x == ' ') "a  b c d"
["a","  b"," c"," d"]

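The difference in semantics that makes this work is that x is the last element of the previous list (the one to which y is either appended, or after which a new list is started).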
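If adding a package for a single function is undesirable, the adjacent-pair semantics can be approximated by hand. The sketch below is not the Data.List.HT implementation; groupByAdjacent is a hypothetical name, and the foldr-based definition is only one possible way to get the behaviour described above:

```haskell
-- Hypothetical, dependency-free stand-in for an adjacent-pair groupBy.
-- The predicate receives two neighbouring elements x and y (x before y)
-- and returns True when y should stay in the same group as x.
groupByAdjacent :: (a -> a -> Bool) -> [a] -> [[a]]
groupByAdjacent p = foldr step []
  where
    step x [] = [[x]]
    step x (g@(y:_) : gs)
      | p x y     = (x : g) : gs      -- x joins the group that y begins
      | otherwise = [x] : g : gs      -- x closes a group of its own
    step x ([] : gs) = [x] : gs       -- unreachable: groups are never empty

-- The splitting function from the question, built on the helper.
partsAdjacent :: String -> [String]
partsAdjacent = groupByAdjacent (\x y -> y /= ' ' || x == ' ')
```

With this definition, partsAdjacent "a  b c d" evaluates to ["a","  b"," c"," d"], matching the HT.groupBy result shown above.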

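Answer 1 (score: 3)

If you are doing many slightly different kinds of splitting, take a look at the split package. The package lets you define this split as split (onSublist [" "])

Answer 2 (score: 1)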

words2 xs = head w : (map (' ':) $ tail w)
  where w = words xs

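Here it is with arrows and applicatives (not recommended for practical use):

-- needs: import Control.Arrow ((>>>))
words3 = words >>> (:) <$> head <*> (map (' ':) . tail)

Edit: my first solution was wrong because it swallowed the extra spaces. Here is a correct one: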

words4 = foldr (\x acc -> if x == ' ' || head acc == "" || (head $ head acc) /= ' '  
                             then (x : head acc) : tail acc
                             else [x] : acc) [""]

Answer 3 (score: 0)

Here is my take:

break2 :: (a->a->Bool) -> [a] -> ([a],[a])
break2 f (x:(xs@(y:ys))) = if f x y then ([x],xs) else (x:u,us) 
                              where (u,us) = break2 f xs
break2 f xs = (xs, [])

onSpace x y = not (isSpace x) && isSpace y

words2 "" = []
words2 xs = y : words2 ys where (y,ys) = break2 onSpace xs

Answer 4 (score: 0)

parts xs = foldr spl [] xs where
   spl x [] = [[x]]
   spl ' ' (xs:xss) = (' ':xs):xss    
   spl x xss@((' ':_):_) = [x]:xss    
   spl x (xs:xss) = (x:xs):xss   

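Answer 5 (score: 0)

I like the idea of the split package, but split (onSublist [" "]) does not do what I want, and I could not find a way to split on one or more spaces with it.

I also like the solution using Data.List.HT, but I would prefer to stay away from dependencies if possible.

The cleanest version I could come up with: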

parts s 
    | null s    = []
    | null a    = (c ++ e) : parts f
    | otherwise = a        : parts b
    where
    (a, b) = break isSpace s
    (c, d) = span  isSpace s
    (e, f) = break isSpace d
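
The three bindings can be collapsed further, because when the string starts with a word the leading run of spaces is simply empty. A minimal sketch under that observation, using only Data.Char from base; the name partsSimple is hypothetical:

```haskell
import Data.Char (isSpace)

-- Hypothetical simplification of the guard-based version.
-- Each element is a (possibly empty) run of whitespace followed by
-- the (possibly empty) word that comes right after it.
partsSimple :: String -> [String]
partsSimple "" = []
partsSimple s  = (spaces ++ word) : partsSimple rest
  where
    (spaces, afterSpaces) = span  isSpace s
    (word,   rest)        = break isSpace afterSpaces
```

For example, partsSimple "a  b c d" evaluates to ["a","  b"," c"," d"]. Trailing whitespace ends up as a final element of its own, as it does in the guard-based version.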

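Answer 6 (score: 0)

Here you go. Enjoy! :D

import Data.Char (isSpace)

-- Note: this skips whitespace entirely, so it behaves like the standard words.
words' :: String -> [String]
words' [] = []
words' te@(x:xs)
  | isSpace x = words' xs   -- isSpace, not just ' ', '\t', '\n': otherwise e.g. '\r' loops forever
  | otherwise = a : words' b
  where
    (a, b) = break isSpace te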