Question

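After getting fed up with regular expressions, I have been trying Scala's parser combinator library as a more intuitive alternative. However, I ran into a problem when I want to search a string for a pattern and ignore whatever comes before it. For example, if I want to check whether a string contains the word "octopus", I can do something like: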

val r = "octopus".r
r.findFirstIn("www.octopus.com")

which correctly gives Some(octopus).

However, using parser combinators:

import scala.util.parsing.combinator._
object OctopusParser extends RegexParsers {

  def any = regex(".".r)*
  def str = any ~> "octopus" <~ any

  def parse(s: String) = parseAll(str, s) 
}

OctopusParser.parse("www.octopus.com")

I get an error on this:

scala> OctopusParser.parse("www.octopus.com")
res0: OctopusParser.ParseResult[String] = 
[1.16] failure: `octopus' expected but end of source found

www.octopus.com

Is there a good way to accomplish this? From playing around, it seems that any is swallowing too much of the input.

Answer 1

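The problem is that your 'any' parser is greedy, so it matches the whole line and leaves nothing for the "octopus" part of 'str' to parse.

You might want to try something like the following: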
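
To make the greediness concrete, here is a small sketch using only scala.util.matching.Regex. It approximates the repeated single-character parser with the pattern `.*`, which is an assumption made for illustration; the names `input`, `greedy`, and `lazyPrefix` are hypothetical. As far as I know, the combinator library does not backtrack into a repetition once it has consumed input, which is why the greedy version leaves nothing behind.

```scala
val input = "www.octopus.com"

// A greedy prefix consumes the entire input, so nothing is left for "octopus".
val greedy = ".*".r.findPrefixOf(input)

// A lazy prefix with a lookahead stops right before the keyword.
val lazyPrefix = ".*?(?=octopus)".r.findPrefixOf(input)

println(greedy)     // Some(www.octopus.com)
println(lazyPrefix) // Some(www.)
```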

object OctopusParser extends RegexParsers {

  def prefix = regex("""[^\.]*\.""".r) // Match on anything other than a dot and then a dot - but only the once
  def postfix = regex("""\..*""".r)* // Grab any number of remaining ".xxx" blocks
  def str = prefix ~> "octopus" <~ postfix

  def parse(s: String) = parseAll(str, s)
}

which then gives me:

scala> OctopusParser.parse("www.octopus.com")
res0: OctopusParser.ParseResult[String] = [1.13] parsed: octopus

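You will probably need to adjust 'prefix' to match the range of input you expect, and you may want to use the lazy '?' quantifier modifier if it turns out to be too greedy.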
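
One possible reading of the lazy-marker suggestion, as a sketch rather than a definitive solution: make the prefix a reluctant `.*?` followed by a lookahead for the keyword, so it accepts any prefix and not only one ending in a dot. The names `LazyOctopusParser`, `prefix`, and `suffix` are hypothetical, and the code assumes the scala-parser-combinators library is on the classpath.

```scala
import scala.util.parsing.combinator._

object LazyOctopusParser extends RegexParsers {
  // Keep whitespace significant so the prefix regex sees the raw input.
  override def skipWhitespace: Boolean = false

  // Lazily consume as little as possible, stopping just before "octopus".
  def prefix: Parser[String] = regex("""(?s).*?(?=octopus)""".r)

  // Greedily consume whatever is left after the keyword.
  def suffix: Parser[String] = regex("""(?s).*""".r)

  // Keep only the keyword; discard the prefix and suffix.
  def str: Parser[String] = prefix ~> "octopus" <~ suffix

  def parse(s: String): ParseResult[String] = parseAll(str, s)
}
```

With this sketch, LazyOctopusParser.parse("www.octopus.com") should succeed with the result octopus, while an input that does not contain the word should fail at the prefix. Note that it stops at the first occurrence of the keyword.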
