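Traditionally, arithmetic operators are treated as binary (left- or right-associative), so most tools only handle binary operators.
Is there a simple way to use Parsec to parse arithmetic operators that can take an arbitrary number of arguments?
For example, the following expression should be parsed into a tree: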
(a + b) + c + d * e + f
Answer 0 (score: 1)
Yes! The key is to first solve a simpler problem: model + and * as tree nodes with exactly two children. To add four things, we simply use + three times.
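As a small illustration of "use + three times", here is a minimal sketch. The `Expr` type is a stand-in with the same binary shape used in this answer, and `sumOf` is a hypothetical helper name, not part of Parsec.

```haskell
-- Stand-in binary expression tree (declared here so the sketch compiles on its own).
data Expr
  = Identifier String
  | Multiply Expr Expr
  | Add Expr Expr
  deriving (Show, Eq)

-- Hypothetical helper: combine a non-empty list of operands with binary Add,
-- nesting to the left. Four operands produce three Add nodes.
sumOf :: [Expr] -> Expr
sumOf = foldl1 Add
```

For the operands a, b, c, d this builds `Add (Add (Add a b) c) d`, the same left-leaning shape a left-associative parser produces.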
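This simpler problem is a convenient one to solve, because the Text.Parsec.Expr module exists for exactly this purpose. Your example can in fact be parsed by the example code in the documentation. I have simplified it slightly here: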
module Lib where

import Text.Parsec
import Text.Parsec.Language
import qualified Text.Parsec.Expr as Expr
import qualified Text.Parsec.Token as Tokens

data Expr =
    Identifier String
  | Multiply Expr Expr
  | Add Expr Expr

instance Show Expr where
  show (Identifier s) = s
  show (Multiply l r) = "(* " ++ (show l) ++ " " ++ (show r) ++ ")"
  show (Add l r) = "(+ " ++ (show l) ++ " " ++ (show r) ++ ")"

-- Some sane parser combinators that we can plagiarize from the Haskell parser.
parens = Tokens.parens haskell
identifier = Tokens.identifier haskell
reserved = Tokens.reservedOp haskell

-- Infix parser.
infix_ operator func =
  Expr.Infix (reserved operator >> return func) Expr.AssocLeft

parser =
  Expr.buildExpressionParser table term <?> "expression"
  where
    table = [[infix_ "*" Multiply], [infix_ "+" Add]]

    term =
      parens parser
      <|> (Identifier <$> identifier)
      <?> "term"
Running it in GHCi:
λ> runParser parser () "" "(a + b) + c + d * e + f"
Right (+ (+ (+ (+ a b) c) (* d e)) f)
There are many ways to convert this tree into the desired form. Here is a simple but slow one:
data Expr' =
    Identifier' String
  | Add' [Expr']
  | Multiply' [Expr']
  deriving (Show)

-- Flatten a chain of nodes selected by the predicate into its list of operands.
collect :: Expr -> (Expr -> Bool) -> [Expr]
collect e f | not (f e) = [e]
collect (Add l r) f =
  collect l f ++ collect r f
collect (Multiply l r) f =
  collect l f ++ collect r f
-- Fallback so the function is total (an Identifier with a predicate that accepts it).
collect e _ = [e]

isAdd :: Expr -> Bool
isAdd (Add _ _) = True
isAdd _ = False

isMultiply :: Expr -> Bool
isMultiply (Multiply _ _) = True
isMultiply _ = False

-- Convert the binary tree into the n-ary form.
optimize :: Expr -> Expr'
optimize (Identifier s) = Identifier' s
optimize e@(Add _ _) = Add' (map optimize (collect e isAdd))
optimize e@(Multiply _ _) = Multiply' (map optimize (collect e isMultiply))
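One likely source of the slowness is the left-nested `++` in `collect`, which costs quadratic time on the left-leaning trees the parser produces. The following is a minimal sketch of a single-pass variant that threads an accumulator instead. The names `flatten`, `addOperands`, `mulOperands` and `example` are hypothetical, and the two types are redeclared (without the custom `Show` instance) only so that the block compiles on its own.

```haskell
-- Types redeclared from the answer so that this sketch stands alone.
data Expr
  = Identifier String
  | Multiply Expr Expr
  | Add Expr Expr

data Expr'
  = Identifier' String
  | Add' [Expr']
  | Multiply' [Expr']
  deriving (Show, Eq)

-- Hypothetical single-pass alternative to optimize/collect.
flatten :: Expr -> Expr'
flatten (Identifier s) = Identifier' s
flatten e@(Add _ _) = Add' (addOperands e [])
flatten e@(Multiply _ _) = Multiply' (mulOperands e [])

-- Prepend the operands of a chain of Add nodes to the accumulator.
addOperands :: Expr -> [Expr'] -> [Expr']
addOperands (Add l r) acc = addOperands l (addOperands r acc)
addOperands e acc = flatten e : acc

-- Prepend the operands of a chain of Multiply nodes to the accumulator.
mulOperands :: Expr -> [Expr'] -> [Expr']
mulOperands (Multiply l r) acc = mulOperands l (mulOperands r acc)
mulOperands e acc = flatten e : acc

-- The binary tree for "(a + b) + c + d * e + f", written out by hand.
example :: Expr
example =
  Add (Add (Add (Add (v "a") (v "b")) (v "c")) (Multiply (v "d") (v "e"))) (v "f")
  where
    v = Identifier
```

In GHCi, `flatten example` prints `Add' [Identifier' "a",Identifier' "b",Identifier' "c",Multiply' [Identifier' "d",Identifier' "e"],Identifier' "f"]`. Note that, as with `optimize`, the parentheses around a + b leave no trace: any chain of the same operator is merged into one node.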
I will note, though, that for the purposes of a parser or compiler, Expr is almost always Good Enough™.
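If the n-ary shape is wanted straight from the parser, one alternative to Text.Parsec.Expr is to parse each precedence level as a separated list with Parsec's `sepBy1`. This is a rough sketch under stated assumptions, not the approach used above: the names `NExpr`, `lexeme`, `symbol`, `nary`, `sumP`, `productP`, `atomP` and `parseN` are all hypothetical, and identifiers are simplified to runs of letters.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical n-ary expression tree.
data NExpr
  = Var String
  | Sum [NExpr]
  | Product [NExpr]
  deriving (Show, Eq)

-- Run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

-- A single operand needs no wrapper node.
nary :: ([NExpr] -> NExpr) -> [NExpr] -> NExpr
nary _ [x] = x
nary mk xs = mk xs

-- Lowest precedence: products separated by "+".
sumP :: Parser NExpr
sumP = nary Sum <$> (productP `sepBy1` symbol "+")

-- Higher precedence: atoms separated by "*".
productP :: Parser NExpr
productP = nary Product <$> (atomP `sepBy1` symbol "*")

-- A parenthesised expression or a variable (assumed: letters only).
atomP :: Parser NExpr
atomP =
  between (symbol "(") (symbol ")") sumP
    <|> (Var <$> lexeme (many1 letter))

parseN :: String -> Either ParseError NExpr
parseN = parse (spaces *> sumP <* eof) ""
```

With these definitions, `parseN "(a + b) + c + d * e + f"` yields `Right (Sum [Sum [Var "a",Var "b"],Var "c",Product [Var "d",Var "e"],Var "f"])`. Unlike flattening after the fact, the parenthesised group stays a nested `Sum`. The trade-off is that associativity and precedence are encoded by hand rather than in an operator table.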