I'm trying to parse a string using Scala's parser combinators, as follows:
import scala.util.parsing.combinator._
import scala.util.parsing.input.CharSequenceReader
object TestPackratParser extends RegexParsers with PackratParsers {
  lazy val program: PackratParser[Any] = "start" ~ water ~ "end" ^^ (_ => println("program"))
  lazy val water: PackratParser[Any] = (""".""".r).* ^^ (_ => println("water"))

  def main(args: Array[String]): Unit = {
    parseAll(phrase(program), new PackratReader(new CharSequenceReader("start something here end")))
  }
}
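I expected this to succeed: since packrat parsers backtrack, "water" should eventually match "something here".
However, "water" seems to match "something here end" instead, which I did not expect. Is there a way to fix this?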
Answer 0 (score: 2)
As for why the packrat parser doesn't backtrack here, see this SO question. That said, one way to get what you want is:
object TestPackratParser extends RegexParsers with PackratParsers {
  // Handle whitespace explicitly instead of letting RegexParsers skip it.
  override val skipWhitespace = false
  lazy val ws = """\s+""".r
  lazy val program: PackratParser[Any] = "start" ~ ws ~ water ~ ws ~ "end" ^^ (_ => println("program"))
  lazy val water: PackratParser[Any] = words ^^ (_ => println("water"))
  // The separator is whitespace that is not followed by "end".
  val words = repsep("""\w+""".r, ws ~ not("end") ^^ { case _ => ""})

  def main(args: Array[String]): Unit = {
    parseAll(phrase(program), new PackratReader(new CharSequenceReader("start something here end")))
  }
}
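The main idea is to use not when specifying the separator between words. The separator matches only when the "end" parser does not succeed at that position, so words keeps consuming input only while the next token is not "end". Otherwise words stops there, and the program parser continues.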
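
To make the effect of the not guard easier to check, here is a minimal sketch of the same grammar that returns the matched words instead of printing. The names WaterSketch, word, separator and parseWords are illustrative and not part of the original answer, and the sketch assumes the scala-parser-combinators library is on the classpath.

```scala
import scala.util.parsing.combinator._

// Illustrative sketch: same grammar shape as the answer above, but the
// parsers return values so the result can be inspected.
object WaterSketch extends RegexParsers {
  // Whitespace is matched explicitly by `ws`, so automatic skipping is off.
  override val skipWhitespace = false

  lazy val ws: Parser[String] = """\s+""".r
  lazy val word: Parser[String] = """\w+""".r

  // The separator is whitespace that is NOT followed by "end".
  // `not` consumes no input; it only checks what comes next.
  lazy val separator: Parser[String] = ws <~ not("end")

  lazy val words: Parser[List[String]] = repsep(word, separator)

  // Keep only the list of words; drop the keywords and surrounding whitespace.
  lazy val program: Parser[List[String]] =
    "start" ~> ws ~> words <~ ws <~ "end"

  // Hypothetical helper: Some(words) on success, None on any parse failure.
  def parseWords(input: String): Option[List[String]] =
    parseAll(program, input) match {
      case Success(result, _) => Some(result)
      case _                  => None
    }
}
```

With this sketch, WaterSketch.parseWords("start something here end") yields Some(List("something", "here")): the separator refuses the whitespace in front of "end", so words stops after "here" and the rest of program matches the trailing whitespace and "end".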