I am trying to parse the following text file, extracting the series of data that sits between two keywords:
many text many text many text
BEGIN
T LISTE2
1 154
2 321
3 519
4 520
5 529
6 426
END
many text many text many text
using the following Haskell program:
import Text.Parsec
import Text.Parsec.String
import Text.Parsec.Char
import Text.Parsec.Combinator

endOfLine :: Parser String
endOfLine = try (string "\n")
        <|> try (string "\r\n")

line = many $ noneOf "\n"

parseListing = do
  spaces
  many $ noneOf "\n"
  spaces
  cont <- between (string "BEGIN\n") (string "END\n") $ endBy line endOfLine
  spaces
  many $ noneOf "\n"
  spaces
  eof
  return cont

main :: IO ()
main = do
  file <- readFile ("test_list.txt")
  case parse parseListing "(stdin)" file of
    Left err -> do putStrLn "!!! Error !!!"
                   print err
    Right resu -> do putStrLn $ concat resu
When I parse my text file, I get the following error:
"(stdin)" (line 16, column 1):
unexpected end of input
expecting "\n", "\r\n" or "END\n"
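I am new to parsing and I don't understand why this fails. My series does sit between BEGIN and END.
Do you know what is wrong with my parser and how to fix it?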
Answer 0 (score: 4)
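between never gets a chance to match its closing keyword, because endBy line endOfLine accepts any line, including END\n, so it keeps consuming lines until it runs out of input.
Only then does your parser try string "END\n", which fails because nothing is left to read. That is why the error message mentions "END\n".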
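The greedy behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the original program: the input string is a hypothetical shortened stand-in for what remains after BEGIN\n has been matched, greedyBody is a name introduced here for illustration, and Parsec's built-in newline is used in place of the question's endOfLine.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One line of text, without its terminating newline.
line :: Parser String
line = many (noneOf "\n")

-- The body parser from the question (hypothetical name): every line
-- followed by a newline. Nothing here tells "END" apart from a data line.
greedyBody :: Parser [String]
greedyBody = endBy line newline

main :: IO ()
main = do
  -- Assumed, shortened input: the text left once "BEGIN\n" has been consumed.
  let rest = "1 154\n2 321\nEND\n"
  -- The body alone swallows the END line as if it were data:
  -- prints Right ["1 154","2 321","END"]
  print (parse greedyBody "(demo)" rest)
  -- The closing keyword therefore only sees the end of input,
  -- and this second parse produces a Left value.
  print (parse (greedyBody *> string "END\n") "(demo)" rest)
```

The second result is the same kind of failure as in the question: the error is reported at the end of the input, not at the END line.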
You have to rewrite the line parser so that it fails on END\n. For example:
parseListing :: Parsec String () [String]
parseListing = do
  spaces
  many $ noneOf "\n"
  spaces
  cont <- between begin end $ endBy (notFollowedBy end >> line) endOfLine
  spaces
  many $ noneOf "\n"
  spaces
  eof
  return cont
  where
    begin = string "BEGIN\n"
    end = string "END\n"
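To try the notFollowedBy guard without a file on disk, it can be run against an inline string. This is a sketch under a few assumptions: block is a hypothetical name for just the BEGIN/END part of the parser, the sample is an abbreviated copy of the data from the question, and Parsec's built-in newline stands in for the question's endOfLine.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One line of text, without its terminating newline.
line :: Parser String
line = many (noneOf "\n")

-- Lines between the keywords (hypothetical helper). The guard makes the
-- line parser fail, without consuming input, as soon as "END\n" is next,
-- so endBy stops and the closing keyword can match.
block :: Parser [String]
block = between begin end (endBy (notFollowedBy end >> line) newline)
  where
    begin = string "BEGIN\n"
    end   = string "END\n"

main :: IO ()
main = do
  -- Assumed, abbreviated sample based on the data in the question.
  let sample = "BEGIN\nT LISTE2\n1 154\n2 321\nEND\n"
  case parse (block <* eof) "(sample)" sample of
    Left err   -> print err
    Right rows -> mapM_ putStrLn rows  -- one extracted line per output line
```

Because notFollowedBy does not consume input when it fails, the END\n text is still available for the closing parser of between.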