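So I'm trying to write a parser for the arithmetic fragment of a programming language I'm playing with, using Scala's RegexParsers. As it stands, my top-level expression parser has the form: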
parser: Parser[Exp] = binAppExp | otherKindsOfParserLike | lval | int
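It accepts lvals (things like "a.b, a.b[c.d], a[b], {record=expression, like=this}") just fine. Now I'd like to enable expressions like "1 + b / c = d",
but potentially with operators that are user-defined at compile time (in the source language, not in Scala).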
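My initial thought was that if I encode the operations recursively and numerically by precedence, I could add higher precedences ad hoc, and each precedence level could parse by consuming lower-precedence sub-terms on the right-hand side of the operator expression. So I'm trying to build a toy version of that idea with some fairly common operators.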
So I'd expect "1 * 2+1" to parse into something like Call(*, Seq(1, Call(+, Seq(2, 1)))), where case class Call(functionName: String, args: Seq[Exp]) extends Exp.
Instead, it parses as IntExp(1).
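Is there a reason this can't work (is it left-recursive in a way I'm missing? If so, I'm sure something else would be wrong too, or it would never terminate, right?), or is it just plain wrong for some other reason?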
def binAppExp: Parser[Exp] = {
  // assume a registry of operations
  val ops = Map(
    (7, Set("*", "/")),
    (6, Set("-", "+")),
    (4, Set("=", "!=", ">", "<", ">=", "<=")),
    (3, Set("&")),
    (2, Set("|"))
  )
  // relevant ops for a level of precedence
  def opsWithPrecedence(n: Int): Set[String] = ops.getOrElse(n, Set.empty)
  // parse an op with some level of precedence
  def opWithPrecedence(n: Int): Parser[String] = ".+".r ^? (
    { case s if opsWithPrecedence(n).contains(s) => s },
    { case s => s"SYMBOL NOT FOUND: $s" }
  )
  // assuming the parse happens, encode it as an AST representation
  def folder(h: Exp, t: LangParser.~[String, Exp]): CallExp =
    CallExp(t._1, Seq(h, t._2))
  val maxPrecedence: Int = ops.maxBy(_._1)._1
  def term: (Int => Parser[Exp]) = {
    case 0 => lval | int | notApp | "(" ~> term(maxPrecedence) <~ ")"
    case n =>
      val lowerTerm = term(n - 1)
      lowerTerm ~ rep(opWithPrecedence(n) ~ lowerTerm) ^^ {
        case h ~ ts => ts.foldLeft(h)(folder)
      }
  }
  term(maxPrecedence)
}
Answer 0 (score: 1)
OK, so there was nothing inherently impossible about what I was trying to do; it was just wrong in the details.
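The core idea is: maintain a mapping from precedence level to operators/parsers, and recursively look for parses based on that table. If you allow parenthesized expressions, just nest a call to the parser for the highest precedence level inside the parser for parenthesized terms.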
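As a rough illustration of that table-driven recursion, independent of any parser library, here is a minimal sketch that works on a list of already-split tokens. The names PrecedenceSketch, IntExp, term, loop and parse, as well as the stand-in AST, are assumptions made for this example and are not taken from the original code. Lower numbers bind more tightly, as in the answer's table.

```scala
object PrecedenceSketch {
  // Hypothetical stand-in AST; the post's real Exp hierarchy is not shown.
  sealed trait Exp
  case class IntExp(value: Int) extends Exp
  case class CallExp(functionName: String, args: Seq[Exp]) extends Exp

  // Precedence table: a lower number binds more tightly.
  val ops: Map[Int, Set[String]] = Map(
    1 -> Set("*", "/"),
    2 -> Set("+", "-")
  )
  val maxPrecedence: Int = ops.keys.max

  // Parses one precedence level and returns the expression together
  // with the tokens that were not consumed.
  def term(n: Int, tokens: List[String]): (Exp, List[String]) =
    if (n == 0) tokens match {
      case "(" :: rest =>
        // A parenthesised term restarts at the loosest-binding level.
        val (inner, afterInner) = term(maxPrecedence, rest)
        afterInner match {
          case ")" :: tail => (inner, tail)
          case _           => sys.error("expected ')'")
        }
      case tok :: rest => (IntExp(tok.toInt), rest)
      case Nil         => sys.error("unexpected end of input")
    } else {
      val (first, rest) = term(n - 1, tokens)
      loop(n, first, rest)
    }

  // Folds successive `op operand` pairs of level n onto the left operand.
  private def loop(n: Int, left: Exp, tokens: List[String]): (Exp, List[String]) =
    tokens match {
      case op :: rest if ops.getOrElse(n, Set.empty[String]).contains(op) =>
        val (right, remaining) = term(n - 1, rest)
        loop(n, CallExp(op, Seq(left, right)), remaining)
      case _ => (left, tokens)
    }

  def parse(tokens: List[String]): Exp = term(maxPrecedence, tokens)._1
}
```

With this table, PrecedenceSketch.parse(List("1", "*", "2", "+", "1")) builds CallExp("+", Seq(CallExp("*", Seq(IntExp(1), IntExp(2))), IntExp(1))), so the multiplication ends up nested inside the addition.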
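In case anyone else wants to do something like this, here is code for a set of arithmetic/logical operators, heavily commented to relate it to the above: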
def opExp: Parser[Exp] = {
  // precedence table: a lower number binds more tightly
  val ops = Map(
    (1, Set("*", "/")),
    (2, Set("-", "+")),
    (3, Set("=", "!=", ">", "<", ">=", "<=")),
    (4, Set("&")),
    (5, Set("|"))
  )
  def opsWithPrecedence(n: Int): Set[String] = ops.getOrElse(n, Set.empty)
  /* before, this was trying to match the remainder of the expression,
     so something like `3 - 2` would parse the Int(3),
     and try to pass "- 2" as an operator to the op parser.
     RegexParsers has an implicit def "literal : String => SubclassOfParser[String]",
     that I'm using explicitly here.
  */
  def opWithPrecedence(n: Int): Parser[String] = {
    // Longest symbols first: `|` commits to the first alternative that
    // matches, so ">" must not be tried before ">=".
    val symbols = opsWithPrecedence(n).toSeq.sortBy(-_.length)
    if (symbols.isEmpty) failure(s"No Ops for Precedence $n")
    else symbols.map(literal).reduce[Parser[String]](_ | _)
  }
  // encode one `op operand` pair as an AST node, with h as the left operand
  def folder(h: Exp, t: TigerParser.~[String, Exp]): CallExp = CallExp(t._1, Seq(h, t._2))
  val maxPrecedence: Int = ops.maxBy(_._1)._1
  def term: (Int => Parser[Exp]) = {
    // level 0: atoms, plus parenthesized terms that restart at the loosest level
    case 0 => lval | int | "(" ~> { term(maxPrecedence) } <~ ")"
    case n if n > 0 =>
      val lowerTerm = term(n - 1)
      lowerTerm ~ rep(opWithPrecedence(n) ~ lowerTerm) ^^ {
        case h ~ ts if ts.nonEmpty => ts.foldLeft(h)(folder)
        case h ~ _ => h
      }
  }
  term(maxPrecedence)
}
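To try the approach end to end, the following is a self-contained sketch that assumes scala-parser-combinators is on the classpath. OpExpDemo, parseExp, greedyOp, the stand-in AST and the digits-only int parser are hypothetical names introduced for this example; lval is left out because its definition is not shown in the post. It also keeps the question's ".+" operator pattern around as greedyOp, to show how it swallows the rest of the input instead of a single operator symbol.

```scala
import scala.util.parsing.combinator.RegexParsers

// Assumption: scala-parser-combinators is available as a dependency.
object OpExpDemo extends RegexParsers {
  // Hypothetical stand-in AST; the post's real Exp hierarchy is not shown.
  sealed trait Exp
  case class IntExp(value: Int) extends Exp
  case class CallExp(functionName: String, args: Seq[Exp]) extends Exp

  // Precedence table: a lower number binds more tightly.
  val ops: Map[Int, Set[String]] = Map(
    1 -> Set("*", "/"),
    2 -> Set("-", "+"),
    3 -> Set("=", "!=", ">", "<", ">=", "<=")
  )
  val maxPrecedence: Int = ops.keys.max

  // Simplified integer literal parser (digits only).
  def int: Parser[Exp] = """\d+""".r ^^ (s => IntExp(s.toInt))

  // The question's operator pattern: ".+" matches everything up to the
  // end of the line, not just one operator symbol.
  def greedyOp: Parser[String] = ".+".r

  // One literal parser per symbol, longest symbols first so that ">="
  // is tried before ">".
  def opWithPrecedence(n: Int): Parser[String] = {
    val symbols = ops.getOrElse(n, Set.empty[String]).toSeq.sortBy(-_.length)
    if (symbols.isEmpty) failure(s"No ops for precedence $n")
    else symbols.map(s => literal(s)).reduce[Parser[String]](_ | _)
  }

  def term(n: Int): Parser[Exp] =
    if (n == 0) int | "(" ~> term(maxPrecedence) <~ ")"
    else {
      val lowerTerm = term(n - 1)
      lowerTerm ~ rep(opWithPrecedence(n) ~ lowerTerm) ^^ {
        case h ~ ts =>
          // Left-associative fold of `op operand` pairs onto the head.
          ts.foldLeft(h) { case (acc, op ~ rhs) => CallExp(op, Seq(acc, rhs)) }
      }
    }

  def opExp: Parser[Exp] = term(maxPrecedence)

  // Returns None when the input is not a complete expression.
  def parseExp(input: String): Option[Exp] =
    parseAll(opExp, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

Under these assumptions, OpExpDemo.parseExp("1 * 2 + 1") yields Some(CallExp("+", Seq(CallExp("*", Seq(IntExp(1), IntExp(2))), IntExp(1)))), while parsing "- 2" with greedyOp returns the whole string "- 2". That string is not in any operator set, which is why the question's `^?` check always rejected it and only the leading IntExp(1) was parsed.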