If the parser fails, try parsing from the next occurrence of a special symbol

Time: 2013-11-10 00:51:04

Tags: parsing haskell attoparsec

Suppose we have a parser:

valid :: Parser Text
valid = string "valid" <* skipWhile (/= '\n')

It can be used to collect the "valid" strings from multi-line text:

> parseOnly (many $ valid <* optional endOfLine) "valid\nvalid\nvalid"
Right ["valid","valid","valid"]

If there is a line on which the valid parser fails, none of the remaining text is parsed:

> parseOnly (many $ valid <* optional endOfLine) "valid\ninvalid\nvalid"
Right ["valid"]

How can I get Right ["valid","valid"] instead? I think try might help here somehow, but I am not sure how to resume parsing from the next line.

1 answer:

Answer 0 (score: 3)

Using parsec:

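import Control.Applicative ((<$>), (<*), (*>))
import Data.Maybe (catMaybes)
import Text.Parsec
import Text.Parsec.String (Parser)

-- parser for the rest of the line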
rest = manyTill anyChar (eof <|> char '\n' *> return ()) <* optional (char '\n')

-- change this to accept lines, but Just the valid ones
valid :: Parser (Maybe String)
-- try lets a partial match such as "vax" fall through to the anyChar branch
valid = (Just <$> try (string "valid") <|> const Nothing <$> anyChar) <* rest

-- filter out Nothing
valids = catMaybes <$> many valid

-- Run
*Foo> runParser valids () "input" "valid1\ninvvalid2\nvalid3"
Right ["valid","valid"]
*Foo> runParser valids () "input" "valid1\nvalid2\nvalid3"
Right ["valid","valid","valid"]

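Here I had to resort to an ugly hack: const Nothing <$> anyChar, so that valid always consumes at least something; otherwise I could not pass it to many, which rejects a parser that can succeed without consuming input. On the other hand, returning Maybe lets you rewrite the parser however you need (for example, to require a newline).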
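
As an illustration of the point about many, here is a minimal sketch of one way to drop the anyChar fallback in parsec. It swaps in a different technique: a `notFollowedBy eof` guard makes the per-line parser fail at the end of the input, so it only ever succeeds after consuming at least one character. The names `restOfLine`, `lineResult`, `validLines` and `parseValidLines` are made up for this example, and it assumes a GHC recent enough that `<$>`, `<$`, `<*` and `*>` come from the Prelude.

```haskell
import Data.Maybe (catMaybes)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helper: skip to the end of the current line and consume its
-- terminator, which is either a newline or the end of the input.
restOfLine :: Parser ()
restOfLine = skipMany (noneOf "\n") <* (() <$ newline <|> eof)

-- One line of input: Just "valid" if the line starts with "valid",
-- Nothing otherwise. The `notFollowedBy eof` guard fails without consuming
-- anything once the input is exhausted, so every success consumes at least
-- one character and `many` can apply this parser safely.
lineResult :: Parser (Maybe String)
lineResult = notFollowedBy eof *> (classify <* restOfLine)
  where
    -- `try` undoes a partial match such as "va" before falling back.
    classify = Just <$> try (string "valid") <|> pure Nothing

-- Keep only the lines that matched.
validLines :: Parser [String]
validLines = catMaybes <$> many lineResult

-- Convenience wrapper; "input" is just a source name for error messages.
parseValidLines :: String -> Either ParseError [String]
parseValidLines = parse validLines "input"
```

With these definitions, parseValidLines "valid1\n\ninvalid2\nvax\nvalid3" should give Right ["valid","valid"]: the blank line, "invalid2" and "vax" each produce Nothing and are filtered out.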

A very similar approach works for attoparsec; sorry for spoiling the fun of working it out yourself.

{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text
import Control.Applicative
import Data.Maybe
import Data.Text (Text)

-- parser for the rest of the line
rest = skipWhile (/= '\n') <* optional endOfLine

-- change this to accept lines, but Just the valid ones
valid :: Parser (Maybe Text)
valid = (Just <$> string "valid" <|> const Nothing <$> anyChar) <* rest

-- filter out Nothing
valids = catMaybes <$> many valid

-- Run
*Main> parseOnly valids "valid1\nvalid2\nvalid3"
Right ["valid","valid","valid"]
*Main> parseOnly valids "valid1\ninvalid2\nvalid3"
Right ["valid","valid"]
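
If every record really is one line, a simpler alternative (a different technique from the combined parser above, sketched here under assumptions) is to split the input into lines first and run the question's original parser on each line separately, keeping only the successes. The helper name `validLines` is made up for this example; `valid` is the question's parser with a Text result type, and the code assumes the attoparsec and text packages are installed.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.Text (Parser, parseOnly, skipWhile, string)
import Data.Either (rights)
import Data.Text (Text)
import qualified Data.Text as T

-- The question's parser: match "valid" and ignore the rest of the line.
valid :: Parser Text
valid = string "valid" <* skipWhile (/= '\n')

-- Hypothetical helper: parse each line on its own, so a failure on one
-- line cannot stop the lines after it from being parsed.
validLines :: Text -> [Text]
validLines = rights . map (parseOnly valid) . T.lines
```

For example, validLines "valid1\ninvalid2\nvalid3" should evaluate to ["valid","valid"]. The trade-off is that the line structure is handled outside the parser, so this only fits inputs where a record never spans more than one line.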