I have a file containing many records in the following format:
Dan Clark’s Profile Photo
Member Name
Dan Clark 2nd degree connection 2nd
Member Occupation
Founder and Headmaster at Some Company, LLC
Nina blalba’s Profile Photo
Member Name
Nina blabla 2nd degree connection 2nd
Member Occupation
Consultant - GAmes executive search
My parser for the file above:
    module Main where

    import Control.Applicative
    import Control.Monad
    import Text.ParserCombinators.Parsec hiding (many, (<|>))

    data Contact = Contact {
        name :: String,
        occupation :: String,
        company :: String
    } deriving Show

    matchContact :: Parser Contact
    matchContact = do
        name <- many anyChar
        char '\''
        string "s Profile Photo"
        char '\n'
        string "Member Name"
        char '\n'
        string name
        many anyChar
        char '\n'
        string "Member Occupation"
        char '\n'
        job <- many anyChar
        try $ string " at "
        company <- many anyChar
        try (char '\n')
        return $ Contact name job company

    main = do
        c <- parseFromFile (many matchContact <* eof) "contacts.txt"
        print c
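There are several problems, for example the data is irregular. But the most urgent one is that I always get an error at the last line of the input file: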
Left "contacts.txt" (line 8670, column 12):
unexpected end of input
expecting "'"
How can I solve this?
Answer 0 (score: 5)
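At the very first many anyChar, the parser will happily consume the rest of the file into the string name, because everything that follows satisfies the criterion of being any character (newlines included). Once the input is exhausted, the next step, char '\'', has nothing left to match, which is why the error is reported at the end of the file. That is clearly not what you want.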
Use manyTill, or restrict the set of allowed characters, so that name ends at the appropriate place.
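
As a rough illustration of both suggestions combined, here is a minimal sketch, not a definitive fix. It makes several assumptions that are not in the question: the helper names lineEnd, restOfLine, photoSuffix, splitAtSep and sample are made up for this example; the apostrophe may be either ' or the typographic ’ (U+2019) that appears in the sample data; the line after "Member Name" is simply skipped rather than matched against the name; and an occupation without " at " yields an empty company.

```haskell
module Main where

import Data.List (stripPrefix)
import Text.Parsec
import Text.Parsec.String (Parser)

data Contact = Contact
  { name       :: String
  , occupation :: String
  , company    :: String
  } deriving (Show, Eq)

-- End of a line: a newline, or the end of the input for the last line.
lineEnd :: Parser ()
lineEnd = (() <$ newline) <|> eof

-- Everything up to the end of the current line; the line end is consumed.
restOfLine :: Parser String
restOfLine = manyTill anyChar lineEnd

-- The fixed text that terminates the name on the first line of a record.
-- Assumption: accept both the ASCII and the typographic apostrophe.
photoSuffix :: Parser ()
photoSuffix = oneOf "'\8217" *> string "s Profile Photo" *> lineEnd

-- Pure helper: split an occupation line at the first " at ", if there is one.
splitAtSep :: String -> (String, String)
splitAtSep s
  | Just rest <- stripPrefix " at " s = ("", rest)
splitAtSep (c:cs) = let (job, comp) = splitAtSep cs in (c : job, comp)
splitAtSep []     = ("", "")

contact :: Parser Contact
contact = do
  -- manyTill stops as soon as the suffix matches; noneOf "\n" keeps the
  -- name from running past the end of its own line.
  n <- manyTill (noneOf "\n") (try photoSuffix)
  _ <- string "Member Name" *> newline
  _ <- restOfLine                       -- "<name> 2nd degree connection 2nd"
  _ <- string "Member Occupation" *> newline
  line <- restOfLine
  let (job, comp) = splitAtSep line
  return (Contact n job comp)

-- Sample input copied from the question.
sample :: String
sample = unlines
  [ "Dan Clark\8217s Profile Photo"
  , "Member Name"
  , "Dan Clark 2nd degree connection 2nd"
  , "Member Occupation"
  , "Founder and Headmaster at Some Company, LLC"
  , "Nina blalba\8217s Profile Photo"
  , "Member Name"
  , "Nina blabla 2nd degree connection 2nd"
  , "Member Occupation"
  , "Consultant - GAmes executive search"
  ]

main :: IO ()
main = print (parse (many contact <* eof) "sample" sample)
```

Because contact fails without consuming anything once the input is exhausted, many contact stops cleanly and eof succeeds. The try around photoSuffix matters: without it, a name that itself contains an apostrophe would leave the terminator half-consumed and manyTill would fail instead of continuing.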