Representing ad-hoc operator precedence in Scala Parsers

Time: 2016-05-08 19:30:35

Tags: scala parsing context-free-grammar

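I'm trying to use Scala's RegexParsers to write a parser for the arithmetic fragment of a programming language I'm playing with. As it stands, my top-level expression parser has the form: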

parser: Parser[Exp] = binAppExp | otherKindsOfParserLike | lval | int

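It accepts lvals (things like "a.b, a.b[c.d], a[b], {record=expression, like=this}") just fine. Now I'd like to enable expressions like "1 + b / c = d", but potentially with operators that are user-defined at compile time (compile time of the source language, not of Scala).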

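My initial thought was that if I encode the operations recursively and numerically by precedence, I could add higher precedences ad hoc, and each precedence level could parse by consuming lower-precedence sub-terms on the right-hand side of the operator expression. So I tried to build a toy version of that idea with a few fairly common operators. I would expect "1 * 2+1" to parse into something like Call(*, Seq(1, Call(+, Seq(2, 1)))), where case class Call(functionName: String, args: Seq[Exp]) extends Exp.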

Instead, it parses as IntExp(1).

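Is there a reason this can't work (is it left-recursive in a way I'm missing? If so, I'm sure something else is wrong too, or it would never terminate, right?), or is it just plain wrong for some other reason?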

  def binAppExp: Parser[Exp] = {
    //assume a registry of operations
    val ops = Map(
      (7, Set("*", "/")),
      (6, Set("-", "+")),
      (4, Set("=", "!=", ">", "<", ">=", "<=")),
      (3, Set("&")),
      (2, Set("|"))
    )

    //relevant ops for a level of precedence
    def opsWithPrecedence(n: Int): Set[String] = ops.getOrElse(n, Set.empty)

    //parse an op with some level of precedence
    def opWithPrecedence(n: Int): Parser[String] = ".+".r ^? (
      { case s if opsWithPrecedence(n).contains(s) => s },
      { case s => s"SYMBOL NOT FOUND: $s" }
      )

    //assuming the parse happens, encode it as an AST representation
    def folder(h: Exp, t: LangParser.~[String, Exp]): CallExp =
      CallExp(t._1, Seq(h, t._2))

    val maxPrecedence: Int = ops.maxBy(_._1)._1

    def term: (Int => Parser[Exp]) = {
      case 0 => lval | int | notApp | "(" ~> term(maxPrecedence) <~ ")"
      case n =>
        val lowerTerm = term(n - 1)
        lowerTerm ~ rep(opWithPrecedence(n) ~ lowerTerm) ^^ {
          case h ~ ts => ts.foldLeft(h)(folder)
        }
    }

    term(maxPrecedence)
  }

1 Answer:

Answer 0: (score: 1)

Well, there was nothing inherently impossible about what I was trying to do; it was just wrong in the details.

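The core idea is: maintain a mapping from precedence level to operators/parsers, and recursively look up the parser to use based on that table. If you allow parenthesized expressions, just nest a call to the parser for the top precedence level (term(maxPrecedence) in the code below) inside the parser for parenthesized terms.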
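To make the table-driven layering concrete without pulling in the parser-combinator library, here is a minimal sketch of the same idea as a hand-written recursive descent over a token list, using only the Scala standard library. It is not the RegexParsers code; the names Num, Call, Layered, tokenize, term and loop are made up for illustration, and the table is cut down to two levels. As in the answer's table, level 1 binds tightest.

```scala
// Hypothetical AST for illustration only (not the asker's Exp hierarchy).
sealed trait Exp
case class Num(value: Int) extends Exp
case class Call(functionName: String, args: Seq[Exp]) extends Exp

object Layered {
  // Precedence table: level 1 binds tightest, higher levels bind more loosely.
  val ops: Map[Int, Set[String]] = Map(
    1 -> Set("*", "/"),
    2 -> Set("-", "+")
  )
  val maxPrecedence: Int = ops.keys.max

  // Split the input into integer literals, parentheses and operator symbols.
  def tokenize(s: String): List[String] =
    """\d+|[()]|[^\s\d()]+""".r.findAllIn(s).toList

  // Parse one term at level n; returns the expression and the unconsumed tokens.
  def term(n: Int, in: List[String]): (Exp, List[String]) =
    if (n == 0) in match {
      case "(" :: rest =>
        // A parenthesized term restarts at the loosest level.
        val (e, after) = term(maxPrecedence, rest)
        after match {
          case ")" :: tail => (e, tail)
          case _ => sys.error("expected )")
        }
      case tok :: rest if tok.forall(_.isDigit) => (Num(tok.toInt), rest)
      case other => sys.error(s"unexpected input: $other")
    } else {
      val (first, rest) = term(n - 1, in)
      loop(n, first, rest)
    }

  // Fold `op term(n - 1)` pairs to the left while the next token is a level-n operator.
  def loop(n: Int, acc: Exp, in: List[String]): (Exp, List[String]) = in match {
    case op :: rest if ops.getOrElse(n, Set.empty[String]).contains(op) =>
      val (rhs, after) = term(n - 1, rest)
      loop(n, Call(op, Seq(acc, rhs)), after)
    case _ => (acc, in)
  }

  def parse(s: String): Exp = term(maxPrecedence, tokenize(s))._1
}
```

With this table, Layered.parse("1 * 2 + 1") groups the multiplication first, and Layered.parse("1 * (2 + 1)") groups the sum first because the parenthesized case re-enters at maxPrecedence. Adding a user-defined operator would then amount to adding an entry to ops.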

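In case anyone else wants to do something like this, here is code for a set of arithmetic/logical operators, heavily commented to relate it to the above: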

  def opExp: Parser[Exp] = {
    sealed trait Assoc

    // Precedence table: level 1 binds tightest, level 5 loosest.
    val ops = Map(
      (1, Set("*", "/")),
      (2, Set("-", "+")),
      (3, Set("=", "!=", ">", "<", ">=", "<=")),
      (4, Set("&")),
      (5, Set("|"))
    )

    def opsWithPrecedence(n: Int): Set[String] = ops.getOrElse(n, Set.empty)

    /* Before, this was trying to match the remainder of the expression,
       so something like `3 - 2` would parse the Int(3),
       and try to pass "- 2" as an operator to the op parser.
       RegexParsers has an implicit def "literal : String => SubclassOfParser[String]",
       that I'm using explicitly here.
    */
    def opWithPrecedence(n: Int): Parser[String] = {
      // Try longer symbols first so that ">" cannot cut ">=" short.
      val ops = opsWithPrecedence(n).toSeq.sortBy(-_.length)
      if (ops.nonEmpty) ops.map(literal).reduce(_ | _)
      else failure(s"No Ops for Precedence $n")
    }

    // Encode one `op rhs` pair applied to the expression parsed so far.
    def folder(h: Exp, t: TigerParser.~[String, Exp]): CallExp = CallExp(t._1, Seq(h, t._2))

    val maxPrecedence: Int = ops.maxBy(_._1)._1

    def term: (Int => Parser[Exp]) = {
      // Level 0: atoms, or a parenthesized expression that restarts at the top level.
      case 0 => lval | int | "(" ~> { term(maxPrecedence) } <~ ")"
      // Level n: operands come from level n - 1, joined left to right by level-n operators.
      case n if n > 0 =>
        val lowerTerm = term(n - 1)
        lowerTerm ~ rep(opWithPrecedence(n) ~ lowerTerm) ^^ {
          case h ~ ts if ts.nonEmpty => ts.foldLeft(h)(folder)
          case h ~ _ => h
        }
    }

    term(maxPrecedence)
  }
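The comment about the old operator parser can be checked with plain regexes, outside the parser entirely. The sketch below uses only the standard library; GreedyDemo, remaining, greedy, known and literalHit are made-up names, and the input string assumes the leading "1" of "1 * 2+1" and the whitespace after it have already been consumed.

```scala
object GreedyDemo {
  // What is left of "1 * 2+1" once the first term and the following space are gone.
  val remaining = "* 2+1"

  // ".+" is greedy: it takes the whole rest of the line, not just the operator symbol.
  val greedy: Option[String] = ".+".r.findPrefixOf(remaining)

  // Checking known operator strings literally takes only the symbol itself.
  val known = Set("*", "/")
  val literalHit: Option[String] = known.find(op => remaining.startsWith(op))
}
```

Here GreedyDemo.greedy is Some("* 2+1"), which is not in any operator set, so in the question's code the partial function passed to ^? is not defined for it and the operator parser fails. rep(...) then succeeds with zero repetitions and only the first term comes back, which would explain the IntExp(1) result. GreedyDemo.literalHit is Some("*"), which is the behavior the literal-based version relies on.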