When writing regexes, I find Scala's """ syntax very convenient, because I can build up my regular expression step by step, one piece per line.
For example:
val foo =
"""
(
|(
|\d{3}
||
|\(\d{3}\)
|)?
|(
|\s|-|\.
|)?
|\d{3}
|(\s|-|\.)
|\d{4}
|(
|\s*
|(
|ext|x|extn|extn.
|)
|\s*
|\d{2,6}
|)?
|)""".stripMargin.replace("\n", "").r
But I would like to be able to write comments explaining what each line does, for example:
val foo =
"""( // start group to capture the phone number
|( // start of optional area code choices
|\d{3} // bare three digits
|| // or
|\(\d{3}\) // three digits enclosed in parentheses
|)? // end of optional area code choices
|( // start of optional separator
|\s|-|\. // separator can be whitespace, dash or period
|)? // end of optional separator
|\d{3} // exchange number (required)
|(\s|-|\.) // same separator but required this time
|\d{4} // final digits (required)
|( // start of optional extension
|\s* // zero or more characters of white space
|( // start of extension indicator
|ext.|x.|ext.|extn. // extension can be indicated by "ext", "x", or "extn" followed by any character
|) // end of extension indicator
|\s* // zero or more characters of white space
|\d{2,6} // two to six digits of extension number
|)? // end of optional extension
|)""".stripMargin.replace("\n", "").trim
println(foo)
val regex = foo.r
val input = "(888)-456-7890 extn: 12345"
regex.findAllIn(input).foreach(println)
But Scala makes the comments part of the string itself. How can I write a commented, multi-line regex the way Python allows?
import re
verboseRegex = re.compile(r'''
( # start group to capture the phone number
( # start of optional area code choices
\d{3} # bare three digits
| # or
\(\d{3}\) # three digits enclosed in parentheses
)? # end of optional area code choices
( # start of optional separator
\s|-|\. # separator can be whitespace, dash or period
)? # end of optional separator
\d{3} # exchange number (required)
(\s|-|\.) # same separator but required this time
\d{4} # final digits (required)
( # start of optional extension
\s* # zero or more characters of white space
( # start of extension indicator
ext|x|ext. # extension can be indicated by "ext", "x", or
# "ext" followed by any character
) # end of extension indicator
\s* # zero or more characters of white space
\d{2,5} # two to five digits of extension number
)? # end of optional extension
) # end phone number capture group
''', re.VERBOSE)
So in the Python code above we use ''', which looks like Scala's """, but we can also write comments.
Answer 0 (score: 2)
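Apparently the embedded flag (?x) is supported; it makes the regex engine ignore whitespace and #-comments in the pattern: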
scala> val r = """(?x)abc
| # works ok
| def""".r
r: scala.util.matching.Regex =
(?x)abc
# works ok
def
scala> "abcdef" match { case r(_*) => }
scala> val r = s"""(?x)abc\n |def #works, I hope\n |123""".stripMargin.r
r: scala.util.matching.Regex =
(?x)abc
def #works, I hope
123
scala> "abcdef123" match { case r(_*) => }
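As a rough sketch of how this could be applied to the phone-number regex from the question: the pattern below reuses the question's pieces and switches the comments from // to #. It assumes the JVM regex engine, where (?x) is the inline form of Pattern.COMMENTS; the names VerbosePhoneRegex and phone are made up for illustration.

```scala
object VerbosePhoneRegex {
  // (?x) turns on COMMENTS mode: unescaped whitespace in the pattern is
  // ignored and everything from '#' to the end of the line is a comment.
  // No stripMargin or replace("\n", "") is needed in this mode.
  val phone = """(?x)
    (                     # start group to capture the phone number
      (                   # start of optional area code choices
        \d{3}             # bare three digits
        |                 # or
        \(\d{3}\)         # three digits enclosed in parentheses
      )?                  # end of optional area code choices
      ( \s|-|\. )?        # optional separator: whitespace, dash or period
      \d{3}               # exchange number (required)
      ( \s|-|\. )         # same separator but required this time
      \d{4}               # final digits (required)
      (                   # start of optional extension
        \s*               # zero or more characters of white space
        ( ext|x|extn|extn. )  # extension indicator, as in the question
        \s*               # zero or more characters of white space
        \d{2,6}           # two to six digits of extension number
      )?                  # end of optional extension
    )                     # end phone number capture group
  """.r

  def main(args: Array[String]): Unit = {
    val input = "(888)-456-7890 extn: 12345"
    phone.findAllIn(input).foreach(println)
  }
}
```

One caveat with this mode: a literal space or # in the pattern has to be escaped (or written as \s or inside \Q...\E), since unescaped ones are dropped.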
Another idea:
scala> val r = s"abc${ "" // comment this
| }def${ "" // not pretty
| }".r
r: scala.util.matching.Regex = abcdef
scala> "abcdef" match { case r(_*) => }
A comment "interpolator" that returns the empty string in those holes might be handy.
scala> val r = s"abc${ comment"empty words here" }".r
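Such an interpolator does not exist in the standard library. A minimal sketch of one, assuming an implicit class on StringContext (the names CommentInterpolator and CommentHelper are hypothetical), might look like this:

```scala
object CommentInterpolator {
  // Hypothetical helper, not part of the standard library: adds a
  // comment"..." interpolator that discards its text and returns "".
  implicit class CommentHelper(val sc: StringContext) {
    def comment(args: Any*): String = ""
  }

  // Each ${ comment"..." } hole contributes the empty string, so the
  // resulting pattern is just the literal pieces joined together.
  val r = s"abc${ comment"literal prefix" }def${ comment"literal suffix" }".r

  def main(args: Array[String]): Unit = {
    println(r.regex)
    println(r.pattern.matcher("abcdef").matches)
  }
}
```

Because the comment text is ordinary string content, it can contain anything that is legal inside a string literal, but it is still evaluated at runtime rather than stripped by the compiler.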
The extra parentheses are no big deal if you are ignoring capture groups anyway:
scala> val r = s"abc${ // comment
| }".r
r: scala.util.matching.Regex = abc()
scala> "abc" match { case r(_*) => }
It's too bad that it interpolates Unit, which renders as (), rather than the empty string.