I am trying to write a parser using parser combinators, with my own tokens defined by a custom data type, rather than using a languageDef as in https://wiki.haskell.org/Parsing_a_simple_imperative_language
So far I have a basic parser for testing parsers, taken from https://jakewheat.github.io/intro_to_parsing/#very-simple-expression-parsing:

import Data.Char (isLetter, isDigit)
import Control.Monad
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Char (digit)

regularParse :: Parser a -> String -> Either ParseError a
regularParse p = parse p ""

-- regularParse num "143"
num :: Parser Integer
num = do
    n <- many1 digit
    return (read n)

-- regularParse var "hello, world!"
var :: Parser String
var = do
    fc <- firstChar
    rest <- many nonFirstChar
    return (fc:rest)
  where
    firstChar = satisfy (\a -> isLetter a || a == '_')
    nonFirstChar = satisfy (\a -> isDigit a || isLetter a || a == '_')
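
One detail of the test helper above is worth making explicit: `parse` succeeds as soon as the parser itself succeeds, even if input is left over. The sketch below restates the two parsers and adds `parseAll`, a hypothetical helper (not part of the original code) that additionally demands end of input via Parsec's `eof`.

```haskell
import Data.Char (isLetter, isDigit)
import Text.ParserCombinators.Parsec

regularParse :: Parser a -> String -> Either ParseError a
regularParse p = parse p ""

-- Hypothetical variant (an assumption, not in the original post):
-- the parse only succeeds if the whole input is consumed.
parseAll :: Parser a -> String -> Either ParseError a
parseAll p = parse (p <* eof) ""

num :: Parser Integer
num = read <$> many1 digit

var :: Parser String
var = do
    fc <- firstChar
    rest <- many nonFirstChar
    return (fc:rest)
  where
    firstChar = satisfy (\a -> isLetter a || a == '_')
    nonFirstChar = satisfy (\a -> isDigit a || isLetter a || a == '_')

-- regularParse num "143"           gives Right 143
-- regularParse var "hello, world!" gives Right "hello" (the rest is ignored)
-- parseAll     var "hello, world!" gives a Left, because ", world!" is left over
```

Which of the two behaviours is wanted depends on whether the parser is meant to be run on a whole input or as one piece of a larger parser.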

In the first example on wiki.haskell.org, the language and its tokens are defined using emptyDef from Text.ParserCombinators.Parsec.Language. This lets them write the language like this:

languageDef =
  emptyDef { Token.commentStart = "/*"
           , ...
           , Token.reservedNames = [ "if"
                                   , "while"
                                   , ...

lexer = Token.makeTokenParser languageDef

reserved = Token.reserved lexer

whileStmt :: Parser Stmt
whileStmt =
  do reserved "while"
     cond <- bExpression
     reserved "do"
     stmt <- statement
     return $ While cond stmt

What should I do if I want to write my own custom data type and use my own tokens, like this?

data Token = Zero
           | One [Char]
           | Two [Char] [Char]

token :: Parser Token
token = twoParser
    <|> oneParser
    <|> zeroParser

twoParser :: Parser Token
twoParser = do
    two <- tExpression
    a <- var
    b <- var
    return $ two a b

tExpression :: Parser Token
tExpression = do
    ...

var :: Parser String
var = do
    fc <- firstChar
    rest <- many nonFirstChar
    return (fc:rest)
  where
    firstChar = satisfy (\a -> isLetter a || a == '_')
    nonFirstChar = satisfy (\a -> isDigit a || isLetter a || a == '_')
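
I'm not sure whether this is the right way to approach the problem, since the tExpression function relies on many things from the language definition. Thanks.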
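
For illustration only, here is a minimal sketch of one way the `Token` type could be parsed without any languageDef. It rests on several assumptions not stated in the question: the concrete syntax uses the keywords `zero`, `one` and `two` followed by zero, one or two variable names, and the helpers `lexeme`, `keyword`, `identChar`, `tokenP` and `parseToken` are hypothetical names introduced here. `keyword` is a hand-rolled stand-in for what `Token.reserved` would provide, built from Parsec's `try`, `string` and `notFollowedBy`. The top-level parser is called `tokenP` rather than `token` to avoid clashing with Parsec's own `token` primitive.

```haskell
import Data.Char (isLetter, isDigit)
import Text.ParserCombinators.Parsec

data Token = Zero
           | One String
           | Two String String
           deriving (Show, Eq)

-- Run a parser, then skip any trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- Characters allowed after the first character of an identifier.
identChar :: Parser Char
identChar = satisfy (\a -> isDigit a || isLetter a || a == '_')

-- Assumed replacement for Token.reserved: match the keyword exactly and
-- reject it when it is only a prefix of a longer identifier ("twofold").
-- 'try' makes a failed attempt consume no input.
keyword :: String -> Parser ()
keyword w = lexeme (try (string w *> notFollowedBy identChar))

var :: Parser String
var = lexeme $ do
    fc <- satisfy (\a -> isLetter a || a == '_')
    rest <- many identChar
    return (fc:rest)

-- Each alternative builds its constructor directly instead of getting
-- it back from a separate tExpression parser.
zeroParser, oneParser, twoParser :: Parser Token
zeroParser = Zero <$ keyword "zero"
oneParser  = One <$> (keyword "one" *> var)
twoParser  = Two <$> (keyword "two" *> var) <*> var

tokenP :: Parser Token
tokenP = twoParser <|> oneParser <|> zeroParser

-- Parse exactly one token, allowing surrounding whitespace.
parseToken :: String -> Either ParseError Token
parseToken = parse (spaces *> tokenP <* eof) ""

-- parseToken "two a b"  gives Right (Two "a" "b")
-- parseToken "one x"    gives Right (One "x")
-- parseToken "zero"     gives Right Zero
```

Note that in this sketch `var` does not reject the keywords themselves, so `one two` parses as `One "two"`. Whether that is acceptable depends on the intended language.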