Scala triple-quoted strings and comments

Date: 2016-08-28 00:46:07

Tags: scala

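When writing regexes, I find Scala's """ syntax very convenient, because I can build up the regular expression step by step, one piece per line.

For example: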

val foo =
"""
  |(
  |(
  |\d{3}
  ||
  |\(\d{3}\)
  |)?
  |(
  |\s|-|\.
  |)?
  |\d{3}
  |(\s|-|\.)
  |\d{4}
  |(
  |\s*
  |(
  |ext|x|extn|extn.
  |)
  |\s*
  |\d{2,6}
  |)?
  |)""".stripMargin.replace("\n", "").r

But I wish I could write comments explaining what each line does, like this:

  val foo =
     """(                       // start group to capture the phone number
       |(                       // start of optional area code choices
       |\d{3}                   // bare three digits
       ||                       // or
       |\(\d{3}\)               // three digits enclosed in parentheses
       |)?                      // end of optional area code choices
       |(                       // start of optional separator
       |\s|-|\.                 // separator can be whitespace, dash or period
       |)?                      // end of optional separator
       |\d{3}                   // exchange number (required)
       |(\s|-|\.)               // same separator but required this time
       |\d{4}                   // final digits (required)
       |(                       // start of optional extension
       |\s*                     // zero or more characters of white space
       |(                       // start of extension indicator
       |ext.|x.|ext.|extn.      // extension can be indicated by "ext", "x", or "extn" followed by any character
       |)                       // end of extension indicator
       |\s*                     // zero or more characters of white space
       |\d{2,6}                 // two to six digits of extension number
       |)?                      // end of optional extension
       |)""".stripMargin.replace("\n", "").trim
  println(foo)
  val regex = foo.r
  val input = "(888)-456-7890 extn: 12345"
  regex.findAllIn(input).foreach(println)

But Scala makes the comments part of the string itself. So how can I combine comments with a multi-line string, the way Python allows?

import re

verboseRegex = re.compile(r'''
    (             # start group to capture the phone number
    (             #  start of optional area code choices
    \d{3}         #   bare three digits
    |             #   or
    \(\d{3}\)     #   three digits enclosed in parentheses
    )?            #  end of optional area code choices
    (             #  start of optional separator
    \s|-|\.       #   separator can be whitespace, dash or period
    )?            #  end of optional separator
    \d{3}         #  exchange number (required)
    (\s|-|\.)     #  same separator but required this time
    \d{4}         #  final digits (required)
    (             #  start of optional extension
    \s*           #   zero or more characters of white space
    (             #   start of extension indicator
    ext|x|ext.    #    extension can be indicated by "ext", "x", or
                  #      "ext" followed by any character
    )             #   end of extension indicator
    \s*           #   zero or more characters of white space
    \d{2,5}       #   two to five digits of extension number
    )?            #  end of optional extension
    )             # end phone number capture group
    ''', re.VERBOSE)

So in the Python code above we use ''', which looks like Scala's """, but we can also write comments inside it.
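
The problem described in the question can be reproduced with a much smaller pattern. The following is only a minimal sketch (the object name and the three-line pattern are made up for illustration): the `//` text is ordinary string content, so it survives `stripMargin` and ends up inside the regex.

```scala
object CommentsLeakIntoString {
  // Inside a string literal, "//" does not start a Scala comment;
  // it is just two more characters of the string.
  val pattern: String =
    """\d{3}   // three digits
      |-       // a dash
      |\d{4}   // four digits""".stripMargin.replace("\n", "")

  // The compiled regex now requires the literal text "   // three digits"
  // to follow the first three digits, so an ordinary number cannot match.
  val matchesPlainNumber: Boolean = pattern.r.findFirstIn("555-1234").isDefined
}
```

Printing `CommentsLeakIntoString.pattern` shows the comment text embedded in the pattern, which is why the commented version of `foo` does not behave like the uncommented one.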

1 answer:

Answer 0 (score: 2)

Apparently, the embedded flag (?x) makes the regex ignore whitespace and comments:

scala> val r = """(?x)abc
     | # works ok
     | def""".r
r: scala.util.matching.Regex =
(?x)abc
# works ok
def

scala> "abcdef" match { case r(_*) => }

scala> val r = s"""(?x)abc\n  |def  #works, I hope\n    |123""".stripMargin.r
r: scala.util.matching.Regex =
(?x)abc
def  #works, I hope
123

scala> "abcdef123" match { case r(_*) => }
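
Applied to the phone-number pattern from the question, the `(?x)` approach might look like the sketch below. It assumes the uncommented alternatives from the first listing (`ext|x|extn|extn.` and `\d{2,6}`); the object and value names are illustrative only. Because whitespace is ignored in this mode, neither `stripMargin` nor `replace("\n", "")` is needed, and comments use `#` rather than `//`.

```scala
object VerbosePhoneRegex {
  // (?x) enables java.util.regex comments mode: unescaped whitespace in the
  // pattern is ignored, and # starts a comment that runs to the end of the line.
  val phone = """(?x)
    (                      # start group to capture the phone number
      (                    #  start of optional area code choices
        \d{3}              #   bare three digits
        |                  #   or
        \(\d{3}\)          #   three digits enclosed in parentheses
      )?                   #  end of optional area code choices
      (\s|-|\.)?           #  optional separator: whitespace, dash or period
      \d{3}                #  exchange number (required)
      (\s|-|\.)            #  same separator but required this time
      \d{4}                #  final digits (required)
      (                    #  start of optional extension
        \s*                #   zero or more characters of white space
        (ext|x|extn|extn.) #   extension indicator
        \s*                #   zero or more characters of white space
        \d{2,6}            #   two to six digits of extension number
      )?                   #  end of optional extension
    )                      # end phone number capture group
    """.r
}
```

With the question's input, `VerbosePhoneRegex.phone.findFirstIn("(888)-456-7890 extn: 12345")` should find the whole number including the extension. One caveat of comments mode: a literal space or `#` in the pattern has to be escaped or written as `\s`, `[ ]`, or `\#`.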

Another idea:

scala> val r = s"abc${ "" // comment this
     | }def${  "" // not pretty
     | }".r
r: scala.util.matching.Regex = abcdef

scala> "abcdef" match { case r(_*) => }

It might be handy to have a comment"interpolator" that returns an empty string in those holes.

scala> val r = s"abc${ comment"empty words here" }".r
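
No such `comment` interpolator exists in the standard library, so it would have to be defined. A minimal sketch, assuming an implicit class on `StringContext` (the names `CommentInterpolator` and `CommentHelper` are hypothetical):

```scala
object CommentInterpolator {
  // Hypothetical helper, not part of the Scala standard library:
  // comment"..." discards its text and always evaluates to the empty string.
  implicit class CommentHelper(sc: StringContext) {
    def comment(args: Any*): String = ""
  }

  // Each hole contributes nothing to the resulting pattern.
  val r = s"abc${ comment"first half" }def${ comment"second half" }".r
}
```

Outside the object, `import CommentInterpolator._` brings the interpolator into scope. Since every hole evaluates to `""`, the compiled pattern is just the literal parts of the string joined together.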

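A hole that contains only a comment leaves extra parentheses in the pattern, which is not a good thing unless you ignore capture groups: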

scala> val r = s"abc${ // comment
     | }".r
r: scala.util.matching.Regex = abc()

scala> "abc" match { case r(_*) => }

Too bad it interpolates unit instead of the empty string.
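
One way around the unit problem would be a custom interpolator that drops `()` values before building the string. This is only a sketch under assumptions: the `rx` interpolator and the surrounding names are hypothetical, and it relies on `StringContext.raw` so that regex backslashes such as `\d` are left untouched.

```scala
import scala.util.matching.Regex

object UnitFriendlyRegex {
  // Hypothetical rx"..." interpolator, not part of the standard library.
  implicit class RegexContext(sc: StringContext) {
    def rx(args: Any*): Regex = {
      // A hole holding only a comment evaluates to Unit; replace it with "".
      val cleaned = args.map {
        case ()    => ""
        case other => other
      }
      // raw skips escape processing, so \d reaches the regex engine as written.
      sc.raw(cleaned: _*).r
    }
  }

  val digitsAfterAb: Regex = rx"""ab${ // a comment-only hole, evaluates to Unit
  }\d+"""
}
```

Here the comment-only hole adds nothing, so `digitsAfterAb` compiles the pattern `ab\d+` with no stray `()` group.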