Can someone post a small example of using IndentParser? I want to parse YAML-like input such as the following:
fruits:
    apples: yummy
    watermelons: not so yummy
vegetables:
    carrots: are orange
    celery raw: good for the jaw
I know there is a YAML package. I want to learn how to use IndentParser.
Answer 0 (score: 2)
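I've sketched a parser below. For your problem you probably only need the block parser from IndentParser. Note that I haven't tried to run it, so it may contain elementary errors.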
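The biggest problem for your parser is not really the indentation, but that you only have strings and colons as tokens. You may find that the code below needs quite a bit of debugging, because it has to be very careful not to consume too much input, although I have tried to pay attention to left-factoring. Because you only have two tokens, there isn't much benefit to be had from Parsec's Token module.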
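To illustrate what left-factoring means here, the following is a minimal sketch in plain Parsec (not IndentParser). The names `Item`, `Leaf`, `Header`, `key`, `item` and `itemBacktracking` are made up for this illustration. Both alternatives begin with the same "key and colon" prefix, so the factored version parses that prefix once and only then chooses a branch, instead of relying on `try` to undo consumed input.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical result type, for illustration only.
data Item
  = Leaf String String   -- "key: some text"
  | Header String        -- "key:" with nothing after it
  deriving (Show, Eq)

-- Shared prefix: letters and spaces up to a colon, then trailing blanks.
key :: Parser String
key = manyTill (letter <|> char ' ') (char ':') <* skipMany (oneOf " \t")

-- Not left-factored: both branches start with 'key', so the first one
-- has to be wrapped in 'try' to give back the input it consumed.
itemBacktracking :: Parser Item
itemBacktracking = try leaf <|> header
  where
    leaf   = Leaf <$> key <*> many1 (noneOf "\n")
    header = Header <$> key

-- Left-factored: parse the shared prefix once, then decide.
item :: Parser Item
item = do
  k <- key
  (Leaf k <$> many1 (noneOf "\n")) <|> return (Header k)

main :: IO ()
main = do
  print (parse item "" "apples: yummy")
  print (parse item "" "fruits:")
  print (parse itemBacktracking "" "fruits:")
```

Both versions accept the same inputs here. The factored one simply never consumes the key twice, which tends to make over-consumption bugs easier to spot.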
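Note that simple-looking formats are often not simple to parse. For learning, writing a parser for simple expressions will teach you much more than a more-or-less arbitrary text format (which might only cause you frustration).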
-- Assumed imports (untested): Parsec's classic interface plus the
-- main module of the IndentParser package.
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.IndentParser

data DefinitionTree = Nested String [DefinitionTree]
                    | Def String String
                    deriving (Show)

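-- Note - this might need some testing.
--
-- This is a tricky one, the parser has to parse trailing
-- spaces and tabs but not a new line.
--
-- Parsec has no 'manyTill1' combinator, so 'many1' followed by the
-- colon is used instead. Parsec's 'space' also accepts newlines, so
-- the key is restricted to letters and plain spaces.
--
category :: IndentCharParser st String
category = do
    { a <- body
    ; rest
    ; return a
    }
  where
    body = do { s <- many1 (letter <|> char ' '); _ <- char ':'; return s }
    rest = many (oneOf [' ', '\t'])
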
-- Because the DefinitionTree data type has two quite
-- different constructors, both sharing the same prefix
-- 'category', this combinator is a bit more complicated
-- than usual, and has to use an Either type to discriminate
-- between the options.
--
definition :: IndentCharParser st DefinitionTree
definition = do
    { a <- category
    ; b <- (textL <|> definitionsR)
    ; case b of
        Left ss  -> return (Def a ss)
        Right ds -> return (Nested a ds)
    }

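-- Note this should parse a string *provided* it is on
-- the same line as the category.
--
-- However you might find this assumption needs verifying...
--
-- The text must be non-empty and runs up to the newline, which is
-- consumed. On an empty rest-of-line this fails without consuming
-- input, so 'definitionsR' can be tried next.
--
textL :: IndentCharParser st (Either String a)
textL = do
    { ss <- many1 (noneOf "\n")
    ; _ <- newline
    ; return (Left ss)
    }
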
-- Finally this one uses an indent parser.
--
definitionsR :: IndentCharParser st (Either a [DefinitionTree])
definitionsR = block body
  where
    body = do { a <- many1 definition; return (Right a) }
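
Since the code above is untested, it can help to have a runnable baseline to compare against. The following is a minimal sketch that produces the same `DefinitionTree` for the sample input using plain Parsec only. It does not use IndentParser's `block`. Instead it counts leading spaces explicitly, which is an assumption of this sketch and not how IndentParser tracks indentation. It also assumes indentation is made of spaces only and that every value is non-empty.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data DefinitionTree
  = Nested String [DefinitionTree]
  | Def String String
  deriving (Show, Eq)

-- A key: starts with a letter, runs up to the colon, then skips
-- trailing spaces and tabs (but never a newline).
category :: Parser String
category = do
  c  <- letter
  cs <- manyTill (letter <|> char ' ') (char ':')
  skipMany (oneOf " \t")
  return (c : cs)

-- End of a line: a newline, or the end of the input.
lineEnd :: Parser ()
lineEnd = (newline >> return ()) <|> eof

-- One definition whose key is indented by exactly n spaces.
definition :: Int -> Parser DefinitionTree
definition n = do
  _ <- count n (char ' ')
  k <- category
  text k <|> nested k
  where
    -- Text on the same line as the key.
    text k = do
      v <- many1 (noneOf "\n")
      lineEnd
      return (Def k v)
    -- Nothing after the key: the following lines must be indented deeper.
    nested k = do
      _ <- newline
      m <- lookAhead (length <$> many (char ' '))
      if m > n
        then Nested k <$> many1 (try (definition m))
        else parserFail "expected an indented block"

definitions :: Parser [DefinitionTree]
definitions = many (try (definition 0)) <* eof

sample :: String
sample = unlines
  [ "fruits:"
  , "    apples: yummy"
  , "    watermelons: not so yummy"
  , "vegetables:"
  , "    carrots: are orange"
  , "    celery raw: good for the jaw"
  ]

main :: IO ()
main = print (parse definitions "sample" sample)
```

For the sample input this should give two `Nested` nodes, `fruits` and `vegetables`, each holding two `Def` entries. The `try` around each `definition` is what lets a block end cleanly when the next line is indented less than the current block.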