Returning meaningful error messages from a parser written with Scala Parser Combinators

Asked: 2010-12-11 21:11:02

Tags: parsing scala error-handling combinators

I am trying to write a parser in Scala using Parser Combinators. If I match recursively,

def body: Parser[Body] =
  ("begin" ~> statementList) ^^ { s => new Body(s) }

def statementList : Parser[List[Statement]] = 
  ("end" ^^ { _ => List() } )|
  (statement ~ statementList ^^ { case statement ~ statementList => statement :: statementList  })

then I get good error messages whenever a statement is wrong. However, this code is ugly and long, so I would like to write:

def body: Parser[Body] =
  ("begin" ~> statementList <~ "end") ^^ { s => new Body(s) }

def statementList : Parser[List[Statement]] = 
    rep(statement)

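This code works, but it only prints a meaningful message when the error is in the FIRST statement. If the error is in a later statement, the message becomes painfully unusable, because the parser expects the whole erroneous statement to be replaced by an "end" token: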

Exception in thread "main" java.lang.RuntimeException: [4.2] error: "end" expected but "let" found

 let b : string = x(3,b,"WHAT???",!ERRORHERE!,7 ) 

 ^ 

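My question: is there a way to get rep and repsep to work with meaningful error messages, so that the caret is placed at the right position instead of at the beginning of the repeated fragment?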

2 answers:

Answer 0: (score: 1)

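Ah, found the solution! It turns out that you need to wrap your main parser in the phrase function, which returns a new parser that is less inclined to backtrack. (I wonder what exactly that means; perhaps that it will not backtrack once it has found a line break?) It keeps track of the last position at which a failure occurred.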

Change:

def parseCode(code: String): Program = {
 program(new lexical.Scanner(code)) match {
      case Success(program, _) => program
      case x: Failure => throw new RuntimeException(x.toString())
      case x: Error => throw new RuntimeException(x.toString())
  }

}

def program : Parser[Program] ...

to:

def parseCode(code: String): Program = {
 phrase(program)(new lexical.Scanner(code)) match {
      case Success(program, _) => program
      case x: Failure => throw new RuntimeException(x.toString())
      case x: Error => throw new RuntimeException(x.toString())
  }

}


def program : Parser[Program] ...
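
To make the effect of phrase concrete, here is a minimal sketch. It assumes the scala-parser-combinators library is on the classpath, and it uses RegexParsers with a made-up toy grammar (`let <name> = <number>`) rather than the token-based parser from the question. The names PhraseDemo, statement, body and input are hypothetical. As far as I understand the library, phrase requires the whole input to be consumed and, on failure, reports the failure recorded furthest into the input, which is why the caret should land inside the broken statement.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical toy grammar, only for illustration.
object PhraseDemo extends RegexParsers {
  // A statement is `let <name> = <number>`; the result is the name.
  def statement: Parser[String] =
    "let" ~> "[a-z]+".r <~ "=" <~ "[0-9]+".r

  // Same shape as the question's grammar: begin, repeated statements, end.
  def body: Parser[List[String]] =
    "begin" ~> rep(statement) <~ "end"

  // The second statement is broken: `oops` is not a number.
  val input = "begin let a = 1 let b = oops end"

  // Wrapping the top-level parser in phrase, as in the answer above.
  val withPhrase: ParseResult[List[String]] = parse(phrase(body), input)
}
```

Printing `PhraseDemo.withPhrase` shows the message together with the caret line. Only the top-level parser is wrapped; the grammar itself stays unchanged.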

Answer 1: (score: 1)

You can do it by combining a "home-made" rep method with non-backtracking inner statements. For example:

scala> object X extends RegexParsers {
     |   def myrep[T](p: => Parser[T]): Parser[List[T]] = p ~! myrep(p) ^^ { case x ~ xs => x :: xs } | success(List())
     |   def t1 = "this" ~ "is" ~ "war"
     |   def t2 = "this" ~! "is" ~ "war"
     |   def t3 = "begin" ~ rep(t1) ~ "end"
     |   def t4 = "begin" ~ myrep(t2) ~ "end"
     | }
defined module X

scala> X.parse(X.t4, "begin this is war this is hell end")
res13: X.ParseResult[X.~[X.~[String,List[X.~[X.~[String,String],String]]],String]] =
[1.27] error: `war' expected but ` ' found

begin this is war this is hell end
                          ^

scala> X.parse(X.t3, "begin this is war this is hell end")
res14: X.ParseResult[X.~[X.~[String,List[X.~[X.~[String,String],String]]],String]] =
[1.19] failure: `end' expected but ` ' found

begin this is war this is hell end
                  ^
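
The same idea can be sketched for a begin/end grammar like the one in the question. This is only an illustration under assumptions: it uses RegexParsers from scala-parser-combinators with a made-up statement syntax (`let <name> = <number>`), and the names CommitDemo, statement, body and input are hypothetical. The myrep helper is copied from the transcript above. Once the keyword "let" has matched, `~!` commits to the rest of the statement, so a problem there is returned as an Error instead of a Failure that the repetition would swallow.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical toy grammar, only for illustration.
object CommitDemo extends RegexParsers {
  // After "let" has matched, ~! commits: a failure in the rest of the
  // statement becomes an Error and is not backtracked over.
  def statement: Parser[String] =
    "let" ~! ("[a-z]+".r <~ "=" <~ "[0-9]+".r) ^^ { case _ ~ name => name }

  // Home-made repetition from the answer: zero or more p, propagating Errors.
  def myrep[T](p: => Parser[T]): Parser[List[T]] =
    p ~! myrep(p) ^^ { case x ~ xs => x :: xs } | success(List())

  def body: Parser[List[String]] =
    "begin" ~> myrep(statement) <~ "end"

  // The second statement is broken: `oops` is not a number.
  val input = "begin let a = 1 let b = oops end"

  val result: ParseResult[List[String]] = parse(body, input)
}
```

Here `CommitDemo.result` is an Error located inside the second statement rather than a Failure at its first token. The trade-off is that a committed statement can no longer be an alternative to something else that also starts with "let".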