How do I define a regular expression in StandardTokenParsers to recognize a path?

Time: 2015-06-15 01:32:30

Tags: regex scala parsing lexical-analysis

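I am writing a parser that should handle arithmetic expressions such as /hdfs://xxx.xx.xx.x:xxxx/path1/file1.jpg+1. I want to parse the expression, convert it from infix to postfix, and evaluate it. I also used the code in another discussion as a starting point.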

    class InfixToPostfix extends StandardTokenParsers {
      import lexical._

      def regexStringLit(r: Regex): Parser[String] = acceptMatch(
        "string literal matching regex " + r,
        { case StringLit(s) if r.unapplySeq(s).isDefined => s })
      def pathIdent: Parser[String] =regexStringLit("/hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+))".r)
      lexical.delimiters ++= List("+", "-", "*", "/", "^", "(", ")", ",")
      def value: Parser[Expr] = numericLit ^^ { s => Number(s) }
      def variable: Parser[Expr] = pathIdent ^^ { s => Variable(s) }
      def parens: Parser[Expr] = "(" ~> expr <~ ")"

      def argument: Parser[Expr] = expr <~ (","?)
      def func: Parser[Expr] = ( pathIdent ~ "(" ~ (argument+) ~ ")" ^^ { case f ~ _ ~ e ~ _ => Function(f, e) })

      def term = (value | parens | func | variable)

      // Needed to define recursive because ^ is right-associative
      def pow: Parser[Expr] = ( term ~ "^" ~ pow ^^ { case left ~ _ ~ right => BinaryOperator(left, "^", right) } |
                  term)
      def factor = pow * ("*" ^^^ { (left: Expr, right: Expr) => BinaryOperator(left, "*", right) } |
                          "/" ^^^ { (left: Expr, right: Expr) => BinaryOperator(left, "/", right) } )
      def sum = factor * ("+" ^^^ { (left: Expr, right: Expr) => BinaryOperator(left, "+", right) } |
                          "-" ^^^ { (left: Expr, right: Expr) => BinaryOperator(left, "-", right) } )
      def expr = ( sum | term )

      def parse(s: String) = {
        val tokens = new lexical.Scanner(s)
        phrase(expr)(tokens)
      }

    // ... and the rest of the code

With the help of this answer I was able to resolve the following errors:

      ScalaParser.scala:192: invalid escape character
  [error]     def pathIdent: Parser[String] =regexStringLit("/hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+))".r)
  [error]                                                               ^
  [error] ScalaParser.scala:192: invalid escape character
  [error]     def pathIdent: Parser[String] =regexStringLit("/hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+))".r)
   [error]                                                                ^
   [error] ScalaParser.scala:192: invalid escape character
   [error]     def pathIdent: Parser[String] =regexStringLit("/hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+))".r)
   [error]                                                                        ^

I changed the path definition to:

  def pathIdent: Parser[String] =regexStringLit("/hdfs://([\\d.]+):(\\d+)/([\\w/]+/(\\w+\\.w+))".r)

Now I get a runtime error that says:

 [1.1] failure: string literal matching regex /hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+)) expected

/hdfs://111.33.55.2:8888/folder1/p.a3d+1
^

This worked with JavaTokenParsers, but for the current changes I have to use StandardTokenParsers.

1 Answer:

Answer 0 (score: 1)

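In a double-quoted string, the backslash is an escape character. If you want a literal backslash in a double-quoted string, you have to escape it, so "\d" should be "\\d".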

Also, you don't need to escape the regex dot inside a character class, because the dot has no special meaning there. So "[\d\.]" should be "[\\d.]".
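As a rough illustration of the two points above, the sketch below builds the host-and-port part of the pattern with doubled backslashes and an unescaped dot inside the character class. The object name `EscapedRegexDemo` and the helper `parseHostPort` are made up for this example and are not part of the question's code.

```scala
import scala.util.matching.Regex

object EscapedRegexDemo {
  // Each regex backslash is doubled so that it survives the Scala string literal.
  // The dot inside [...] is left unescaped: in a character class it is a literal dot.
  val hostPort: Regex = "([\\d.]+):(\\d+)".r

  // Escaping the dot inside the class is allowed but redundant.
  val hostPortEscapedDot: Regex = "([\\d\\.]+):(\\d+)".r

  // Hypothetical helper: returns (host, port) when the whole input matches.
  def parseHostPort(s: String): Option[(String, String)] = s match {
    case hostPort(host, port) => Some((host, port))
    case _                    => None
  }
}
```

Calling `EscapedRegexDemo.parseHostPort("111.33.55.2:8888")` should yield the host and port as a pair, while an input containing a letter in the host part yields `None`.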

You can also use the raw interpolator or a triple-quoted multi-line string literal to avoid all this escaping business.
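A minimal sketch of those alternatives, using only the standard library: the same path pattern is written once with escaped backslashes, once with the `raw` interpolator, and once as a triple-quoted literal. The object name `PathRegexDemo` and the helpers `samePattern` and `parse` are invented for illustration. This checks only the regex itself and does not involve the `StandardTokenParsers` lexer.

```scala
import scala.util.matching.Regex

object PathRegexDemo {
  // Ordinary string literal: every regex backslash has to be doubled.
  val escaped: Regex = "/hdfs://([\\d.]+):(\\d+)/([\\w/]+/(\\w+\\.\\w+))".r

  // raw interpolator: backslashes are passed through unchanged.
  val rawInterp: Regex = raw"/hdfs://([\d.]+):(\d+)/([\w/]+/(\w+\.\w+))".r

  // Triple-quoted literal: no escape processing either.
  val tripleQuoted: Regex = """/hdfs://([\d.]+):(\d+)/([\w/]+/(\w+\.\w+))""".r

  // All three spellings should produce the same pattern text.
  def samePattern: Boolean =
    escaped.regex == rawInterp.regex && rawInterp.regex == tripleQuoted.regex

  // Hypothetical helper: extracts (host, port, path, file) on a full match.
  def parse(s: String): Option[(String, String, String, String)] = s match {
    case tripleQuoted(host, port, path, file) => Some((host, port, path, file))
    case _                                    => None
  }
}
```

With the sample path from the question (without the trailing `+1`), `PathRegexDemo.parse("/hdfs://111.33.55.2:8888/folder1/p.a3d")` should return the host, port, full path, and file name as a tuple.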