How do you count occurrences of a string within a string (Haskell)

Asked: 2012-03-14 22:33:47

Tags: haskell

Is there a library in Haskell that provides a function like this?

> function "bal bla hu bla" ["bla","bal"]
[(2,"bla"),(1,"bal")]

5 answers:

Answer 0 (score: 3)

Data.Text provides a count function, but it works on Text rather than String, so we have to use pack:

import Data.Text (pack, count)

function haystack needles = map go needles
     where 
        packed = pack haystack
        go needle = (count (pack needle) packed, needle)
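
One caveat worth illustrating: Data.Text.count raises an error on an empty needle and counts non-overlapping matches. The sketch below wraps it in a hypothetical helper, safeCount (not part of the text library), that returns Nothing for an empty needle instead of failing.

```haskell
import qualified Data.Text as T

-- Hypothetical helper (an assumption for illustration, not a library function):
-- Nothing for an empty needle, which T.count rejects with an error.
safeCount :: String -> String -> Maybe Int
safeCount needle haystack
  | null needle = Nothing
  | otherwise   = Just (T.count (T.pack needle) (T.pack haystack))
```

With this, safeCount "bla" "bal bla hu bla" gives Just 2, and safeCount "aa" "aaaa" gives Just 2 because the two matches do not overlap.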

Answer 1 (score: 3)

Using Data.List.Split:

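import Control.Arrow ((&&&))
import Data.List.Split (splitOn)

occs :: Eq a => [a] -> [[a]] -> [(Int, [a])]
occs str = map (count str &&& id)
  where
    -- splitOn yields one more piece than there are (non-overlapping) matches
    count s x = length (splitOn x s) - 1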

 >occs "bal bla hu bla" ["bla","bal"]
 [(2,"bla"),(1,"bal")]

UPD

Parsec is also useful here.

{-# LANGUAGE NoMonomorphismRestriction,
             FlexibleContexts
  #-}
import Control.Arrow ((&&&))
import Data.Either (partitionEithers)
import Text.Parsec

occs :: String -> [String] -> [(Int, String)]
occs s = map (countP s &&& id)

countP str substr = either (const 0) occsNumber $ parse (parseMany substr) "" str
  where
    occsNumber = length . snd . partitionEithers

parseSingle :: Stream s m Char => String -> ParsecT s u m (Either Char String)
parseSingle s = fmap Right (try (string s)) <|> fmap Left anyChar

parseMany :: Stream s m Char => String -> ParsecT s u m [Either Char String]
parseMany = many . parseSingle

The result is still the same:

> occs "bal bla hu bla" ["bla","bal"]
[(2,"bla"),(1,"bal")]

Answer 2 (score: 2)

import Control.Arrow ((&&&))
import Data.List (isPrefixOf, tails)

yourFunction :: Eq a => [a] -> [[a]] -> [(Int, [a])]
yourFunction haystack = map (count &&& id)
  where count needle = length . filter (needle `isPrefixOf`) . tails $ haystack

(Untested.)
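
The tails/isPrefixOf approach checks every starting position, so it counts overlapping matches, whereas the splitting-based answers count non-overlapping ones. The following sketch contrasts the two; countOverlapping and countNonOverlapping are hypothetical names chosen for illustration, and treating an empty needle as zero matches is an assumption.

```haskell
import Data.List (isPrefixOf, stripPrefix, tails)

-- Counts every starting position, so matches may overlap.
countOverlapping :: Eq a => [a] -> [a] -> Int
countOverlapping needle = length . filter (needle `isPrefixOf`) . tails

-- Hypothetical variant: after a match, resume scanning past the matched text.
countNonOverlapping :: Eq a => [a] -> [a] -> Int
countNonOverlapping [] _ = 0  -- assumption: an empty needle counts as zero matches
countNonOverlapping needle haystack = go haystack
  where
    go [] = 0
    go s@(_:rest) = case stripPrefix needle s of
      Just remainder -> 1 + go remainder
      Nothing        -> go rest
```

For the question's input both give 2 for "bla", but for needle "aa" in "aaa" the first gives 2 and the second gives 1.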

Answer 3 (score: 1)

OMG OMG! Every solution given so far has quadratic computational complexity (even Data.Text has quadratic worst-case running time).

Clearly you need to roll your own string search algorithm!

Here is my take. I think it is a variant of KMP.

data Searcher = Found | Initial Searcher | Searching Char Searcher Searcher

runSearcher :: Searcher -> Char -> Searcher
runSearcher (Searching c suc fail) s | c == s = suc
 | otherwise = runSearcher fail s
runSearcher (Initial s) _ = s

mkSearcher pattern = initial where
  initial = go (Initial initial) pattern

  go fallback [] = Found
  go fallback (c:t) = Searching c (go (runSearcher fallback c) t) fallback

search :: String -> String -> Integer
search pat = go searcher where
  searcher = mkSearcher pat
  go Found s = 1 + go searcher s
  go src (c:t) = go (runSearcher src c) t
  go src [] = 0

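Still, there is plenty of room for improvement! If we preprocess the input string by building a prefix tree or something similar, searching for multiple patterns can be done more efficiently than searching for them one at a time...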
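
One possible reading of the preprocessing idea, sketched under assumptions: build a naive trie of all suffixes of the input once, then answer each pattern by walking it from the root. The names Trie, suffixTrie, countIn and countAll are hypothetical. This naive trie can itself take quadratic time and space to build, so it only illustrates the shape of the idea; a real suffix tree or suffix automaton would be needed for linear preprocessing. It counts overlapping occurrences.

```haskell
import Data.List (foldl', tails)
import qualified Data.Map.Strict as M

-- A trie node: how many suffixes pass through it, plus its children.
data Trie = Trie Int (M.Map Char Trie)

emptyTrie :: Trie
emptyTrie = Trie 0 M.empty

-- Insert one suffix, bumping the counter of every node along its path.
insertSuffix :: Trie -> String -> Trie
insertSuffix (Trie n kids) []     = Trie (n + 1) kids
insertSuffix (Trie n kids) (c:cs) = Trie (n + 1) (M.insert c child kids)
  where child = insertSuffix (M.findWithDefault emptyTrie c kids) cs

-- Naive suffix trie of the haystack (quadratic size in the worst case).
suffixTrie :: String -> Trie
suffixTrie = foldl' insertSuffix emptyTrie . tails

-- Walk the needle from the root; the counter reached is the number of
-- suffixes that start with the needle, i.e. its (overlapping) occurrences.
countIn :: Trie -> String -> Int
countIn (Trie n _)    []     = n
countIn (Trie _ kids) (c:cs) = maybe 0 (`countIn` cs) (M.lookup c kids)

-- Build the trie once and reuse it for every needle.
countAll :: String -> [String] -> [(Int, String)]
countAll haystack needles = [(countIn trie needle, needle) | needle <- needles]
  where trie = suffixTrie haystack
```

After the trie is built, each lookup costs time proportional to the needle's length (times a logarithmic map lookup per character), independent of the haystack's length.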

Answer 4 (score: 0)

A "train wreck" solution:

import Data.List

f txt ws = map freq $ filter isElem $ group $ sort $ words txt where
    isElem (w:_) = w `elem` ws
    freq xs@(x:_) = (length xs, x)

From there, you can go "monadic":

import Data.List
import Control.Monad

f txt ws = do
    xs@(x:_) <- group $ sort $ words txt 
    guard $ x `elem` ws
    return (length xs, x)
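
Note that this approach splits the text with words, so it counts whole-word matches only and returns results in sorted word order. A small sketch of the difference, with hypothetical names wordCounts and substringCount chosen for illustration:

```haskell
import Data.List (group, isPrefixOf, sort, tails)

-- The word-based approach from this answer, written as a list comprehension.
wordCounts :: String -> [String] -> [(Int, String)]
wordCounts txt ws =
  [ (length xs, x) | xs@(x:_) <- group (sort (words txt)), x `elem` ws ]

-- A substring count for comparison (counts every starting position).
substringCount :: String -> String -> Int
substringCount needle = length . filter (needle `isPrefixOf`) . tails
```

For the question's input, wordCounts "bal bla hu bla" ["bla","bal"] yields [(1,"bal"),(2,"bla")], which has the right counts but a different order. For "blabla bla" the word-based version reports 1 for "bla", while substringCount "bla" "blabla bla" reports 3.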