Question

I am trying to define a grammar for the commands below.

object ParserWorkshop {
    def main(args: Array[String]) = {
        ChoiceParser("todo link todo to database")
        ChoiceParser("todo link todo to database deadline: next tuesday context: app.model")
    }
}

The second command should be tokenized as:

action = todo
message = link todo to database
properties = [deadline: next tuesday, context: app.model]

When I run this input against the grammar defined below, I get the following error message:

[1.27] parsed: Command(todo,link todo to database,List())
[1.36] failure: string matching regex `\z' expected but `:' found

todo link todo to database deadline: next tuesday context: app.model
                                   ^

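As far as I can tell, it fails because the pattern that matches the words of the message is nearly identical to the pattern for the key of a key: value pair, so the parser cannot tell where the message ends and the properties begin. I could solve this by insisting that every property begin with a start token: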

todo link todo to database :deadline: next tuesday :context: app.model

But I would rather keep the command as close to natural language as possible. I have two questions:

What does the error message actually mean? And how would I modify the existing grammar so that it works for the given input strings?

import scala.util.parsing.combinator._

case class Command(action: String, message: String, properties: List[Property])
case class Property(name: String, value: String)

object ChoiceParser extends JavaTokenParsers {
    def apply(input: String) = println(parseAll(command, input))

    def command = action~message~properties ^^ {case a~m~p => new Command(a, m, p)}

    def action = ident

    def message = """[\w\d\s\.]+""".r

    def properties = rep(property)

    def property = propertyName~":"~propertyValue ^^ {
        case n~":"~v => new Property(n, v)
    }

    def propertyName: Parser[String] = ident

    def propertyValue: Parser[String] = """[\w\d\s\.]+""".r
}

Answer 1

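This is really quite simple. When you use ~, you have to understand that there is no backtracking into individual parsers that have already completed successfully.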

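So, for example, message consumed everything up to the colon (including the word deadline), because all of it matches its pattern. Next, properties is a rep of property, which requires a propertyName, but all it finds is the colon (the first character not gobbled up by message). So propertyName fails, and property fails. Now properties, as mentioned, is a rep, so it finishes successfully with 0 repetitions, which in turn makes command finish successfully.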

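So, back to parseAll. The command parser returned successfully, having consumed everything before the colon. parseAll then asks: are we at the end of the input (\z)? No, because a colon comes right next. So it expected end of input but found a colon.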
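To see the greedy match in isolation, here is a minimal sketch that applies the message pattern directly with scala.util.matching.Regex, outside the parser combinators. The object name GreedyMessageDemo is made up for illustration, and the input is the second command with the action already stripped off.

```scala
object GreedyMessageDemo {
  // Same pattern as `message` in the question's grammar.
  val messagePattern = """[\w\d\s\.]+""".r

  // The second command, minus the leading action "todo".
  val input = "link todo to database deadline: next tuesday context: app.model"

  // findPrefixOf only matches at the start of the input, which is also how
  // a regex parser is applied at the current position.
  val consumed: Option[String] = messagePattern.findPrefixOf(input)

  def main(args: Array[String]): Unit = {
    println(consumed) // prints Some(link todo to database deadline)
  }
}
```

The match stops at the first colon, so deadline is already part of the message by the time properties gets its turn.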

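You will have to change the regex so that it does not consume the last identifier before a colon. Note that propertyValue uses the same pattern, so it needs the same change, or it will swallow the name of the next property. For example: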

def message = """[\w\d\s\.]+(?![:\w])""".r
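Put together, the change might look like the sketch below. It assumes the scala-parser-combinators library is on the classpath and applies the lookahead to propertyValue as well as message, since the two share a pattern. The names LookaheadParser, text and parseCommand are introduced here for illustration and are not part of the question's code.

```scala
import scala.util.parsing.combinator._

case class Command(action: String, message: String, properties: List[Property])
case class Property(name: String, value: String)

object LookaheadParser extends JavaTokenParsers {
  // Shared pattern (assumption: message and propertyValue both use it).
  // The negative lookahead forces the match to end at a word boundary
  // that is not directly followed by a colon.
  private val text = """[\w\d\s\.]+(?![:\w])"""

  def command: Parser[Command] = action ~ message ~ properties ^^ {
    case a ~ m ~ p => Command(a, m, p)
  }

  def action: Parser[String] = ident

  def message: Parser[String] = text.r

  def properties: Parser[List[Property]] = rep(property)

  // `":" ~> p` drops the colon and keeps only the value.
  def property: Parser[Property] = propertyName ~ (":" ~> propertyValue) ^^ {
    case n ~ v => Property(n, v)
  }

  def propertyName: Parser[String] = ident

  def propertyValue: Parser[String] = text.r

  def parseCommand(input: String): ParseResult[Command] = parseAll(command, input)

  def main(args: Array[String]): Unit = {
    println(parseCommand("todo link todo to database"))
    println(parseCommand("todo link todo to database deadline: next tuesday context: app.model"))
  }
}
```

Under these assumptions the second command should parse into a Command with two properties, deadline and context. The trade-off is that any word directly followed by a colon is treated as a property name, so neither the message nor a value can contain one.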

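By the way, when you use def, you force the expression to be re-evaluated. In other words, each of these defs creates a new parser every time it is called, and the regular expressions are instantiated every time the parsers they belong to are processed. If you change everything to val, you will get much better performance.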
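A minimal sketch of that change, assuming lazy val rather than plain val: a plain val that refers to a parser declared further down in the same object would see it as null during initialization, and lazy val avoids that while still building each parser only once. The object name LazyValParser is hypothetical, only the property part of the grammar is shown, and properties are returned as plain tuples to keep the example short.

```scala
import scala.util.parsing.combinator._

object LazyValParser extends JavaTokenParsers {
  // Each lazy val is evaluated once, on first use, so the Regex and the
  // Parser objects are created a single time instead of on every call.
  lazy val properties: Parser[List[(String, String)]] = rep(property)

  lazy val property: Parser[(String, String)] =
    propertyName ~ (":" ~> propertyValue) ^^ { case n ~ v => (n, v) }

  lazy val propertyName: Parser[String] = ident

  // Same lookahead as in the suggested message regex.
  lazy val propertyValue: Parser[String] = """[\w\d\s\.]+(?![:\w])""".r

  def main(args: Array[String]): Unit = {
    println(parseAll(properties, "deadline: next tuesday context: app.model"))
  }
}
```

Declaration order no longer matters here, which is why properties can refer to property before it is defined.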

Remember, these things define the parser; they do not run it. It is parseAll that runs the parser.
