Question

I want to implement this grammar rule using Haskell's parsec library:

((a | b | c)* (a | b))?

This is a parser rule that accepts an optional (i.e. potentially empty) string. If the string it accepts is not empty, it can be consumed by zero or more applications of the a, b or c parsers, but the last character of the string accepted by the outermost ? (optional) parser must be consumed by parser a or b, not c. Here is an example:

module Main where

import Text.Parsec
import Text.Parsec.String (Parser) -- String-based parsers, so the String literals below type-check

a, b, c :: Parser Char
a = char 'a'
b = char 'b'
c = char 'c'

-- ((a | b | c)* (a | b))?
myParser = undefined

shouldParse1, shouldParse2, shouldParse3,
  shouldParse4, shouldFail :: Either ParseError String

-- these should succeed
shouldParse1 = runParser myParser () "" "" -- because ? optional
shouldParse2 = runParser myParser () "" "b"
shouldParse3 = runParser myParser () "" "ccccccb"
shouldParse4 = runParser myParser () "" "aabccab"

-- this should fail because it ends with a 'c'
shouldFail = runParser myParser () "" "aabccac"

main = do
  print shouldParse1
  print shouldParse2
  print shouldParse3
  print shouldParse4
  print shouldFail

A first attempt might look like this:

myParser = option "" $ do
  str <- many (a <|> b <|> c)
  ch <- a <|> b
  return (str ++ [ch])

But the many just consumes all the 'a', 'b' and 'c' characters in each test case, leaving a <|> b no characters to consume.
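To make the failure concrete, here is a minimal sketch that runs the first attempt on a few of the test inputs. The names firstAttempt and succeeds are hypothetical helpers introduced only for this illustration, and the sketch assumes String input via Text.Parsec.String.

```haskell
module Main where

import Data.Either (isRight)
import Text.Parsec
import Text.Parsec.String (Parser)

a, b, c :: Parser Char
a = char 'a'
b = char 'b'
c = char 'c'

-- The first attempt from the question, under a hypothetical name.
firstAttempt :: Parser String
firstAttempt = option "" $ do
  str <- many (a <|> b <|> c) -- greedily takes every 'a', 'b' and 'c'
  ch <- a <|> b               -- so nothing is left for this step
  return (str ++ [ch])

-- Hypothetical helper: True when the parser accepts the input.
succeeds :: String -> Bool
succeeds = isRight . parse firstAttempt ""

main :: IO ()
main = mapM_ report ["", "b", "ccccccb", "aabccab"]
  where
    report s = putStrLn (show s ++ " -> " ++ show (succeeds s))
```

Only the empty input is accepted here. On any non-empty input the do block fails after many has already consumed characters, and option only falls back to its default when its argument fails without consuming input.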

The question:

Using parsec combinators, what is the correct implementation of ((a | b | c)* (a | b))? to define myParser?

Answer 1

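We can also state this slightly differently: the c in your parser may only succeed if it is followed by another token, which can be done with a single lookAhead: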

myParser = many (a <|> b <|> (c <* (lookAhead anyToken <?> "non C token"))) <* eof
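As a rough check, the definition above can be dropped into a self-contained module and run against the five inputs from the question. This is only a sketch: the accepts helper is a hypothetical name added for illustration, and String input via Text.Parsec.String is assumed.

```haskell
module Main where

import Data.Either (isRight)
import Text.Parsec
import Text.Parsec.String (Parser)

a, b, c :: Parser Char
a = char 'a'
b = char 'b'
c = char 'c'

-- A 'c' is accepted only when at least one more token follows it,
-- so the input can never end with 'c'.
myParser :: Parser String
myParser = many (a <|> b <|> (c <* (lookAhead anyToken <?> "non C token"))) <* eof

-- Hypothetical helper: True when the whole input is accepted.
accepts :: String -> Bool
accepts = isRight . parse myParser ""

main :: IO ()
main = mapM_ report ["", "b", "ccccccb", "aabccab", "aabccac"]
  where
    report s = putStrLn (show s ++ " -> " ++ show (accepts s))
```

The first four inputs are accepted and "aabccac" is rejected. Because of the trailing eof, this parser must consume the entire input, so it also rejects anything containing characters other than 'a', 'b' and 'c'.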

Haskell parsec: `many` combinator inside an `optional` combinator
