Distinguishing empty regex matches from no match in Haskell

Time: 2017-09-19 19:55:17

Tags: regex haskell pcre

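I tried to use regex-pcre, but regex-base's RegexContext has too many overloads, so I can't tell which one to use for the task at hand.

I'd like to match a string against the following regular expression:

(foo)-(bar)|(quux)-(quux)(q*u*u*x*)

using a function of this type:

myMatch :: String -> Maybe (String, String, Maybe String)

Sample outputs:

  • myMatch "dfjdjk" should be Nothing, since there is no match

  • myMatch "foo-bar" should be Just ("foo", "bar", Nothing), because the first alternative has no third capture group

  • myMatch "quux-quuxqu" should be Just ("quux", "quux", Just "qu")

  • myMatch "quux-quux" should be Just ("quux", "quux", Just ""), because the third capture group is present but empty

This is not homework; I'm just confused about how https://github.com/erantapaa/haskell-regexp-examples/blob/master/RegexExamples.hs doesn't contain code paths for the cases where there is no match or no capture group.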

3 answers:

Answer 0: (score: 3)

One way to do this is to use getAllTextSubmatches:

import Text.Regex.PCRE

myMatch :: String -> Maybe (String, String, Maybe String)
myMatch str = case getAllTextSubmatches $ str =~ "(foo)-(bar)|(quux)-(quux)(q*u*u*x*)" :: [String] of
  []                      -> Nothing
  [_, g1, g2, "", "", ""] -> Just (g1, g2, Nothing)
  [_, "", "", g3, g4, g5] -> Just (g3, g4, Just g5)

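With [String] as its return type, getAllTextSubmatches returns an empty list if there is no match, or a list of all capture groups of the first match (index 0 being the whole match).

Alternatively, if a matched group may be empty, so that pattern matching on the empty string cannot tell it apart from an unmatched group, you can use [(String, (MatchOffset, MatchLength))] as the return type of getAllTextSubmatches and pattern match on a MatchOffset of -1 to identify the groups that did not match: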

myMatch :: String -> Maybe (String, String, Maybe String)
myMatch str = case getAllTextSubmatches $ str =~ "(foo)-(bar)|(quux)-(quux)(q*u*u*x*)" :: [(String, (MatchOffset, MatchLength))] of
  []                                                              -> Nothing
  [_, (g1, _), (g2, _), (_, (-1, _)), (_, (-1, _)), (_, (-1, _))] -> Just (g1, g2, Nothing)
  [_, (_, (-1, _)), (_, (-1, _)), (g3, _), (g4, _), (g5, _)]      -> Just (g3, g4, Just g5)

Now, if that looks too verbose:

{-# LANGUAGE PatternSynonyms #-}

pattern NoMatch = ("", (-1, 0))

myMatch :: String -> Maybe (String, String, Maybe String)
myMatch str = case getAllTextSubmatches $ str =~ "(foo)-(bar)|(quux)-(quux)(q*u*u*x*)" :: [(String, (MatchOffset, MatchLength))] of
  []                                               -> Nothing
  [_, (g1, _), (g2, _), NoMatch, NoMatch, NoMatch] -> Just (g1, g2, Nothing)
  [_, NoMatch, NoMatch, (g3, _), (g4, _), (g5, _)] -> Just (g3, g4, Just g5)
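
The offset check used above can also be factored into a small helper that turns each submatch entry into a Maybe. The following is only a sketch using base alone: the names Submatch, groupText and interpret are hypothetical (they are not part of regex-pcre), the offset and length are modelled as plain Int, and it assumes, as described above, that an offset of -1 marks a group that did not take part in the match.

```haskell
-- Hypothetical stand-in for (String, (MatchOffset, MatchLength)).
type Submatch = (String, (Int, Int))  -- (text, (offset, length))

-- Assumption: an offset of -1 means the group did not take part in the match.
groupText :: Submatch -> Maybe String
groupText (_, (-1, _)) = Nothing
groupText (s, _)       = Just s

-- Interpret the entries produced for (foo)-(bar)|(quux)-(quux)(q*u*u*x*):
-- the whole match followed by five capture groups.
interpret :: [Submatch] -> Maybe (String, String, Maybe String)
interpret subs = case map groupText subs of
  [_, Just g1, Just g2, Nothing, Nothing, Nothing] -> Just (g1, g2, Nothing)
  [_, Nothing, Nothing, Just g3, Just g4, g5]      -> Just (g3, g4, g5)
  _                                                -> Nothing  -- no match, or an unexpected shape
```

For example, applying interpret to the hand-written list [("quux-quux",(0,9)), ("",(-1,0)), ("",(-1,0)), ("quux",(0,4)), ("quux",(5,4)), ("",(9,0))] gives Just ("quux","quux",Just ""): the last group is empty but still counts as matched because its offset is not -1. Unlike the case expressions above, this version also has a catch-all branch, so an unexpected list shape yields Nothing instead of a pattern match failure.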

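Answer 1: (score: 1)

To tell when there is no match, use =~~ so that the result is placed in the Maybe monad. If there is no match, it calls fail, which yields Nothing in Maybe: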

myMatch :: String -> Maybe (String, String, Maybe String)
myMatch str = do
    let regex = "(foo)-(bar)|(quux)-(quux)(q*u*u*x*)"
    groups <- getAllTextSubmatches <$> str =~~ regex :: Maybe [String]
    case groups of
        [_, g1, g2, "", "", ""] -> Just (g1, g2, Nothing)
        [_, "", "", g3, g4, g5] -> Just (g3, g4, Just g5)
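
To see why running a matcher in Maybe separates "no match" from "matched the empty string", here is a toy illustration that uses only base. The function afterPrefix is a hypothetical stand-in, not regex-pcre's matcher; it only mimics the convention of calling fail when nothing matches. It assumes a base version where MonadFail is exported from the Prelude (GHC 8.8 or later).

```haskell
import Data.List (stripPrefix)

-- Toy stand-in for a matcher (not regex-pcre): returns whatever follows
-- the given prefix, or calls fail when the prefix is absent.
afterPrefix :: MonadFail m => String -> String -> m String
afterPrefix prefix str = case stripPrefix prefix str of
  Just rest -> pure rest
  Nothing   -> fail "no match"

-- In the Maybe monad, fail becomes Nothing, so the two cases stay distinct.
noMatch, emptyMatch :: Maybe String
noMatch    = afterPrefix "quux-" "dfjdjk"  -- Nothing: the prefix is absent
emptyMatch = afterPrefix "quux-" "quux-"   -- Just "": matched, nothing follows
```

This only covers the "no match at all" case; telling an empty capture group from an unmatched one inside a successful match still needs the offset information discussed in the first answer.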

Answer 2: (score: 0)

Use regex-applicative (the string literals below rely on the OverloadedStrings extension):

{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative
import Text.Regex.Applicative

myMatch = match re
re = foobar <|> quuces where
    foobar = (,,) <$> "foo" <* "-" <*> "bar" <*> pure Nothing
    quuces = (,,)
        <$> "quux" <* "-"
        <*> "quux"
        <*> (fmap (Just . mconcat) . sequenceA)
            [many $ sym 'q', many $ sym 'u', many $ sym 'u', many $ sym 'x']

Or, with the ApplicativeDo extension enabled:

re = foobar <|> quuces where
    foobar = do
        foo <- "foo"
        _ <- "-"
        bar <- "bar"
        pure (foo, bar, Nothing)
    quuces = do
        quux1 <- "quux"
        _ <- "-"
        quux2 <- "quux"
        quux3 <- fmap snd . withMatched $
            traverse (many . sym) ("quux" :: [Char])
            -- [many $ sym 'q', many $ sym 'u', many $ sym 'u', many $ sym 'x']
        pure (quux1, quux2, Just quux3)