Haskell: find the indices of a String within a String

Time: 2012-07-11 12:18:38

Tags: string haskell find substring

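I am currently trying to find the indices of a specific string within another string. For example, the result for the string "ab" in "ababa baab ab bla ab" should be 11 and 18. The problem at the moment is that my function also returns the indices 0 and 8.
My function: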

findSubstringIndices :: String -> String -> [Int]
findSubstringIndices text pattern = map (add 1) (findIndices (pattern `isPrefixOf`) (tails text))

3 answers:

Answer 0 (score: 6)

Picking the right tool is important here.

import Data.Char

You can use a slightly modified version of the Prelude function words, which is defined as:

words :: String -> [String]
words s = case dropWhile isSpace s of
  "" -> []
  s' -> w : words s'' where (w, s'') = break isSpace s'

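It splits a string into a list of whitespace-separated words. The modification amounts to tagging each word with the index at which it starts in the string. For example: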

words' :: String -> [(Int, String)]
words' = go 0
  where
    go n s = case break (not . isSpace) s of
      (_, "")  -> []
      (ws, s') -> (n', w) : go (n' + length w) s''
                     where
                       n'       = n + length ws
                       (w, s'') = break isSpace s'

For example:

> words' "ababa baab ab bla ab"
[(0,"ababa"),(6,"baab"),(11,"ab"),(14,"bla"),(18,"ab")]

Now, writing the function findSubstringIndices becomes almost trivial:

findSubstringIndices :: String -> String -> [Int]
findSubstringIndices text pattern = [i | (i, w) <- words' text, w == pattern]

Does it work? Yes, it does:

> findSubstringIndices "ababa baab ab bla ab" "ab"
[11,18]
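
For contrast, here is a minimal sketch of why the approach in the question reports extra indices: tails combined with isPrefixOf finds every position where the pattern starts, including positions inside longer words. The name allOccurrences is made up for this illustration, and the question's map (add 1) step is left out because add is not a standard function.

```haskell
import Data.List (findIndices, isPrefixOf, tails)

-- Hypothetical helper: every index at which the pattern starts,
-- whether or not it forms a whole word there.
allOccurrences :: String -> String -> [Int]
allOccurrences text pattern = findIndices (pattern `isPrefixOf`) (tails text)
```

On "ababa baab ab bla ab" with the pattern "ab" this yields [0,2,8,11,18]: the matches at 0 and 2 lie inside "ababa" and the one at 8 inside "baab". Comparing whole words, as words' does, keeps only 11 and 18.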

Answer 1 (score: 0)

findWordIndices' :: String -> String -> [Int]
findWordIndices' w = snd . foldl doStuff (0, []) . words
  where
    doStuff (cur, results) word =
        if word == w 
        then (cur + length word + 1, cur : results) 
        else (cur + length word + 1, results)  

However, it returns the indices in reverse order.

g>let str = "ababa baab ab bla ab"
str :: [Char]
g>findWordIndices' "ab" str
[18,11]
it :: [Int]

This can be fixed by using (++) instead of cons ((:)).

findWordIndices'' :: String -> String -> [Int]
findWordIndices'' w = snd . foldl doStuff (0, []) . words
  where
    doStuff (cur, results) word =
        if word == w 
        then (cur + length word + 1, results ++ [cur]) 
        else (cur + length word + 1, results)

g>let str = "ababa baab ab bla ab"
str :: [Char]
g>findWordIndices'' "ab" str
[11,18]
it :: [Int]
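
Appending with (++) copies the accumulated list at every match, so a possible alternative is to keep the cons version and reverse the result once at the end. The following is only a sketch; the name findWordIndicesRev is hypothetical, and, like the code above, it assumes the words are separated by exactly one space and the text has no leading whitespace.

```haskell
-- Hypothetical variant: collect indices with (:) and reverse once at the end.
findWordIndicesRev :: String -> String -> [Int]
findWordIndicesRev w = reverse . snd . foldl step (0, []) . words
  where
    step (cur, results) word =
      -- Assumption: exactly one space follows each word.
      let next = cur + length word + 1
      in  if word == w then (next, cur : results) else (next, results)
```

With the same input as above, findWordIndicesRev "ab" "ababa baab ab bla ab" gives [11,18].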

Answer 2 (score: 0)

Another variation on words:

import Data.Char
import Control.Arrow

words' s = 
  case dropWhile isSpace' s of
    [] -> []
    s' -> ((head >>> snd) &&& map fst) w : words' s'' 
          where (w, s'') = break isSpace' s'
  where isSpace' = fst >>> isSpace

indices text pattern =
  map fst $ filter (snd >>> ((==) pattern)) $ words' $ zip text [0..]

main = do
  putStrLn $ show $ indices "ababa baab ab bla ab" "ab"
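
The same idea can be sketched without Control.Arrow, which may make the steps easier to follow: pair every character with its position, split the pairs on whitespace, and keep the position of each word's first character. The names indexedWords and wordIndices are made up for this sketch, and the pairs are ordered (index, character) rather than (character, index) as in the code above.

```haskell
import Data.Char (isSpace)

-- Split an index-tagged string into words, each paired with
-- the index of its first character.
indexedWords :: [(Int, Char)] -> [(Int, String)]
indexedWords xs =
  case dropWhile (isSpace . snd) xs of
    [] -> []
    xs'@((i, _) : _) ->
      let (w, rest) = break (isSpace . snd) xs'
      in  (i, map snd w) : indexedWords rest

-- Start indices of the whole words in text that equal pattern.
wordIndices :: String -> String -> [Int]
wordIndices text pattern =
  [i | (i, w) <- indexedWords (zip [0 ..] text), w == pattern]
```

For the example string, wordIndices "ababa baab ab bla ab" "ab" evaluates to [11,18].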