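I'm trying to create a very simple parser with parser combinators (to parse something similar to BNF). I have read several blog posts that explain them (the ones that rank highest on Google, at least for me), and I thought I understood, but the tests say otherwise.
I have checked the existing questions on StackOverflow, and although some of them may apply and be useful once I try other things, the best way to explain is with a concrete example:
This is my main: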
def main(args: Array[String]) {
  val parser: BaseParser = new BaseParser
  val eol = sys.props("line.separator")
  val test = s"a = b ${eol} a = c ${eol}"
  System.out.println(test)
  parser.parse(test)
}
This is the parser:
import com.github.trylks.tests.parser.ParserClasses._
import scala.util.parsing.combinator.syntactical._
import scala.util.parsing.combinator.ImplicitConversions
import scala.util.parsing.combinator.PackratParsers

class BaseParser extends StandardTokenParsers with ImplicitConversions with PackratParsers {
  val eol = sys.props("line.separator")
  lexical.delimiters += ("=", "|", "*", "[", "]", "(", ")", ";", eol)

  def rules = rep1sep(rule, eol) ^^ { Rules(_) }
  def rule = id ~ "=" ~ repsep(expression, "|") ^^ flatten3 { (e1: ID, _: Any, e3: List[Expression]) => Rule(e1, e3) }
  def expression: Parser[Expression] = (element | parenthesized | optional) ^^ { x => x } // and sequence and repetition, but that's another problem...
  def parenthesized: Parser[Expression] = "(" ~> expression <~ ")" ^^ { x => x }
  def optional: Parser[Expression] = "[" ~> expression <~ "]" ^^ { Optional(_) }
  def element: Parser[Element] = (id | constant) ^^ { x => x }
  def constant: Parser[Constant] = stringLit ^^ { Constant(_) }
  def id: Parser[ID] = ident ^^ { ID(_) }

  def parse(text: String): Option[Rules] = {
    val s = rules(new lexical.Scanner(text))
    s match {
      case Success(res, next) => {
        println("Success!\n" + res.toString)
        Some(res)
      }
      case Error(msg, next) => {
        println("error: " + msg)
        None
      }
      case Failure(msg, next) => {
        println("failure: " + msg)
        None
      }
    }
  }
}
These are the classes that are missing from the code above:
object ParserClasses {
  abstract class Element extends Expression
  case class ID(value: String) extends Element {
    override def toString(): String = value
  }
  case class Constant(value: String) extends Element {
    override def toString(): String = value
  }
  abstract class Expression
  case class Optional(value: Expression) extends Expression {
    override def toString() = s"[$value]"
  }
  case class Rule(head: ID, body: List[Expression]) {
    override def toString() = s"$head = ${body.mkString(" | ")}"
  }
  case class Rules(rules: List[Rule]) {
    override def toString() = rules.mkString("\n")
  }
}
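The problem is: as the code stands now, it doesn't work; it parses only one rule (not two). If I replace eol with ";" (both in main and in the parser), then it works (at least for this test).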
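One way to narrow this down is to look at the tokens the lexer actually hands to the grammar. The following is only a minimal diagnostic sketch, assuming the scala-parser-combinators library is on the classpath; `TokenDump` and `tokens` are hypothetical names introduced here, not part of the code above. If the line separator never shows up among the printed tokens, then `rep1sep(rule, eol)` has no separator to match and will stop after the first rule.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical helper for debugging; not part of the original parser.
object TokenDump extends StandardTokenParsers {
  val eol = sys.props("line.separator")
  // Same idea as in BaseParser: register the delimiters, including eol.
  lexical.delimiters ++= List("=", "|", eol)

  // Walks the scanner and collects every token it produces for the text.
  def tokens(text: String): List[lexical.Token] = {
    @annotation.tailrec
    def loop(in: lexical.Scanner, acc: List[lexical.Token]): List[lexical.Token] =
      if (in.atEnd) acc.reverse else loop(in.rest, in.first :: acc)
    loop(new lexical.Scanner(text), Nil)
  }
}
```

For example, `TokenDump.tokens(s"a = b ${TokenDump.eol} a = c").foreach(println)` prints one token per line, which makes it easy to see whether the separator survives lexing.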
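Most people seem to prefer regex parsers, and none of the blog posts that explain parser combinators goes into detail about which traits can or cannot be extended, so I don't know the differences between them or why there are several (I mention this because it may be important for understanding why the code doesn't work). The problem there is: if I try to use regex parsers, I get errors for all the strings that I specify in the parsers, such as "=", "*", etc.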