Replace/substitute with Haskell regex libraries

Asked: 2012-01-30 22:20:17

Tags: regex haskell

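Is there a high-level API for doing search-and-replace with regular expressions in Haskell? In particular, I'm looking at the Text.Regex.TDFA or Text.Regex.Posix packages. I'd really like something of the type: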

f :: Regex -> (ResultInfo -> m String) -> String -> m String

So, for example, to replace "dog" with "cat" you could write

runIdentity . f "dog" (return . const "cat")    -- :: String -> String

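or do more advanced things with the monad, like counting occurrences, etc.

Haskell's documentation for this is quite sparse. Some notes on the low-level API are here.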

6 answers:

Answer 0 (score: 28)

How about subRegex from Text.Regex (in the regex-compat package)?

Prelude Text.Regex> :t subRegex
subRegex :: Regex -> String -> String -> String

Prelude Text.Regex> subRegex (mkRegex "foo") "foobar" "123"
"123bar"

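As a small sketch of how subRegex behaves beyond the one-liner above: it replaces every match, not just the first, and the replacement string may refer to capture groups as \1, \2, and so on (\0 is the whole match). The helper names dogsToCats and swapPair are made up for illustration, and the block assumes the regex-compat package is installed.

```haskell
import Text.Regex (mkRegex, subRegex)

-- Hypothetical helper: replace every "dog" with "cat".
dogsToCats :: String -> String
dogsToCats input = subRegex (mkRegex "dog") input "cat"

-- Hypothetical helper: swap the two sides of each name=number pair.
-- "\\2" and "\\1" are backreferences to the capture groups.
swapPair :: String -> String
swapPair input = subRegex (mkRegex "([a-z]+)=([0-9]+)") input "\\2=\\1"

-- dogsToCats "dog eat dog"  gives "cat eat cat"
-- swapPair "x=1, y=22"      gives "1=x, 22=y"
```

Note that the replacement is a fixed template, so this does not cover the monadic use case from the question.
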
Answer 1 (score: 5)

I don't know of any existing function that does this, but I think I'd end up using something like the AllMatches [] (MatchOffset, MatchLength) instance of RegexContext to simulate it:

{-# LANGUAGE FlexibleContexts #-}
import Control.Monad (foldM)
import Data.List (foldl')
-- Text.Regex.Base is re-exported by Text.Regex.TDFA and Text.Regex.Posix.
import Text.Regex.Base (RegexLike, getAllMatches, match)

replaceAll :: RegexLike r String => r -> (String -> String) -> String -> String
replaceAll re f s = start end
  where matches = getAllMatches (match re s) :: [(Int, Int)]  -- (offset, length) of each match
        (_, end, start) = foldl' go (0, s, id) matches
        go (ind,read,write) (off,len) =
          let (skip, start) = splitAt (off - ind) read
              -- split the unread text at the match; splitting 'matched' itself would <<loop>>
              (matched, remaining) = splitAt len start
          in (off + len, remaining, write . (skip++) . (f matched ++))

replaceAllM :: (Monad m, RegexLike r String) => r -> (String -> m String) -> String -> m String
replaceAllM re f s = do
  let matches = getAllMatches (match re s) :: [(Int, Int)]
      go (ind,read,write) (off,len) = do
        let (skip, start) = splitAt (off - ind) read
            (matched, remaining) = splitAt len start
        replacement <- f matched
        return (off + len, remaining, write . (skip++) . (replacement++))
  (_, end, start) <- foldM go (0, s, return) matches
  start end
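
To connect this back to the question's "counting occurrences" case, here is a minimal sketch that runs a monadic replacement in State. It restates a compact replaceAllM (fixed to the TDFA Regex type, and accumulating a plain String -> String) so the block is self-contained. The name replaceCounting is hypothetical, and the regex-tdfa and mtl packages are assumed to be available.

```haskell
import Control.Monad (foldM)
import Control.Monad.State (State, modify, runState)
import Text.Regex.TDFA (Regex, getAllMatches, makeRegex, match)

-- Compact restatement of the monadic replace, specialised to TDFA's Regex.
replaceAllM :: Monad m => Regex -> (String -> m String) -> String -> m String
replaceAllM re f s = do
  (_, unread, write) <- foldM go (0, s, id) matches
  return (write unread)
  where
    -- (offset, length) of every non-overlapping match, left to right
    matches = getAllMatches (match re s) :: [(Int, Int)]
    go (pos, unread, write) (off, len) = do
      let (skip, fromMatch) = splitAt (off - pos) unread
          (matched, remaining) = splitAt len fromMatch
      replacement <- f matched
      return (off + len, remaining, write . (skip ++) . (replacement ++))

-- Hypothetical helper: replace "dog" with "cat" and count the replacements.
replaceCounting :: String -> (String, Int)
replaceCounting input = runState (replaceAllM (makeRegex "dog") step input) 0
  where
    step :: String -> State Int String
    step _ = modify (+ 1) >> return "cat"

-- replaceCounting "dog eat dog"  gives ("cat eat cat", 2)
```

The callback only sees the matched text here. Passing capture groups would need a richer match type than (offset, length).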

Answer 2 (score: 3)

Based on @rampion's answer, but with the bug fixed so that it doesn't just <<loop>>:

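import Data.List (foldl')
-- Regex is assumed to be the TDFA one; Text.Regex.Posix exports the same names.
import Text.Regex.TDFA (Regex, getAllMatches, match)

replaceAll :: Regex -> (String -> String) -> String -> String
replaceAll re f s = start end
  where matches = getAllMatches (match re s) :: [(Int, Int)]  -- (offset, length) of each match
        (_, end, start) = foldl' go (0, s, id) matches
        go (ind,read,write) (off,len) =
            let (skip, start) = splitAt (off - ind) read
                (matched, remaining) = splitAt len start
            in (off + len, remaining, write . (skip++) . (f matched ++))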

Answer 3 (score: 1)

Maybe this approach suits you.

import Data.Array ((!))
import Text.Regex.TDFA ((=~), MatchArray)

replaceAll :: String -> String -> String -> String
replaceAll regex new_str str =
    -- Index 0 of each MatchArray is the whole match; taking every element
    -- would also rewrite capture groups at stale offsets.
    let parts = map (! 0) (str =~ regex :: [MatchArray])
    -- Replace from the last match backwards so earlier offsets stay valid.
    in foldl (replace' new_str) str (reverse parts)
  where
     replace' :: [a] -> [a] -> (Int, Int) -> [a]
     replace' new list (shift, l) =
        let (pre, post) = splitAt shift list
        in pre ++ new ++ drop l post

Answer 4 (score: 1)

You can use replaceAll from the Data.Text.ICU.Replace module:

Prelude> :set -XOverloadedStrings
Prelude> import Data.Text.ICU.Replace
Prelude Data.Text.ICU.Replace> replaceAll "cat" "dog" "Bailey is a cat, and Max is a cat too."
"Bailey is a dog, and Max is a dog too."

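Answer 5 (score: 0)

For doing "search and replace" with "more advanced things with the monad, like counting occurrences, etc.", I recommend Replace.Megaparsec.streamEditT.

For a concrete example of how to count occurrences, see the package README.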
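
As a rough sketch of what that can look like (this is not the README's own example): the block below replaces "dog" with "cat" while counting matches in State. It assumes the replace-megaparsec, megaparsec and mtl packages are installed, and that streamEditT takes a parser, a monadic edit function and the input, in that order. The name countAndReplace is hypothetical.

```haskell
import Control.Monad.State.Strict (State, modify', runState)
import Data.Void (Void)
import Replace.Megaparsec (streamEditT)
import Text.Megaparsec (ParsecT, chunk)

-- Hypothetical helper: replace every "dog" with "cat" and count the matches.
countAndReplace :: String -> (String, Int)
countAndReplace input = runState (streamEditT pat edit input) 0
  where
    -- The pattern is a parser rather than a regex: match the literal "dog".
    pat :: ParsecT Void String (State Int) String
    pat = chunk "dog"

    -- Runs once per match; bumps the counter and returns the replacement.
    edit :: String -> State Int String
    edit _ = do
      modify' (+ 1)
      pure "cat"
```

Under those assumptions, countAndReplace "dog eat dog" should evaluate to ("cat eat cat", 2). The main design difference from the regex-based answers is that the pattern is an ordinary Megaparsec parser, so the edit function receives a parsed value instead of raw match offsets.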