I'm using Scala to process lines of text that are data rich. One example is:
0101 Test A123456-7 N Ag Ri R 123 Im K8 V
To parse this, I ported a regular expression that I've used in other languages. However, I'm doing something wrong. My erroneous object is:
object UwpParser extends App
{
val Pattern = "^(\\d\\d\\d\\d) (\\S.+) ([ABCDEX]\\d\\d\\d\\d\\d\\d-\\d) (..)\\s*(\\w.{17}) (.) (\\d\\d\\d) (\\w\\w) (.*)$".r;
var data = scala.io.Source.fromFile( "test.txt" ).getLines.mkString;
for (p <- Pattern findAllIn data) p match
{
case Pattern(c) => println( c )
case _ => None
}
}
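The for block is only there to check whether I've captured my data. Clearly I haven't. I'm sure I'm doing several things wrong. I've searched Stack Overflow, but the questions there seem to be about something different, or there's something I'm not getting.
Update: it works now. Thanks to the person who posted the scaladoc reference! My corrected code is: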
object UwpParser extends App
{
val Pattern = """^(\d\d\d\d) (\S.+) ([ABCDEX]\d\d\d\d\d\d-\d) (..)\s*(\w.{17}) (.) (\d\d\d) (\w\w) (.*)$""".r;
var data = scala.io.Source.fromFile( "test.txt" ).getLines.mkString;
data match {
case Pattern(hex, name, uwp, bases, codes, zone, pbg, alleg, stellar) => println( s"$name ($hex) $uwp" );
}
}
Answer 0 (score: 2)
A recent nightly build has clarified scaladoc:
http://www.scala-lang.org/files/archive/nightly/2.11.x/api/2.11.x/#scala.util.matching.Regex
It has many examples of capturing groups in pattern matching.
I hope this version of the documentation is easier to read.
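As a minimal sketch of how capturing groups behave in a pattern match: a regex extractor matches only when the number of names you bind equals the number of capturing groups, and by default the whole string has to match. The `Pair` pattern and the `describe` helper below are hypothetical, made up for illustration, and are not the pattern from the question.

```scala
// Hypothetical two-group pattern, used only to illustrate extractor arity.
val Pair = """(\w+)=(\d+)""".r

def describe(s: String): String = s match {
  // Compiles, but can never be chosen: the regex has two groups, not one.
  case Pair(key)        => s"one name: $key"
  // Two groups, two names: this is the case that fires.
  case Pair(key, value) => s"$key -> $value"
  // The whole string must match, so anything else lands here.
  case _                => "no match"
}
```

With a nine-group pattern such as the one in the question, `case Pattern(c)` falls through to the default case for the same reason, which is why binding all nine names (or using `_*` for the ones you don't need) makes the match succeed.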
Also, don't you intend to match the regex against each line of data?
val p = """your regex""".r
for (line <- text.getLines) {
line match {
case p(field1, field2, field3, _*) => // do something with first 3 capturing groups
case _ => // skip lines that do not match instead of throwing a MatchError
}
}
instead of gluing the input together and then taking it apart again.
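Here is a self-contained sketch of that line-by-line approach. It assumes a simplified, hypothetical pattern (`Line`) that captures only the hex, name, and UWP and lumps the rest into a final group, and it uses an in-memory list as a stand-in for the lines of test.txt. The second record is invented sample data.

```scala
// Hypothetical, simplified pattern: hex, name, UWP, then everything else.
val Line = """^(\d{4}) (\S.*?) ([ABCDEX]\d{6}-\d) (.*)$""".r

// Stand-in for scala.io.Source.fromFile("test.txt").getLines
val input = List(
  "0101 Test A123456-7 N Ag Ri R 123 Im K8 V",
  "this line is not a record",
  "0102 Other World B654321-8 S Hi R 456 Im M2 V"
)

// collect keeps only the lines the pattern matches;
// _* ignores any capturing groups after the first three.
val parsed: List[(String, String, String)] = input.collect {
  case Line(hex, name, uwp, _*) => (hex, name, uwp)
}
```

Using `collect` rather than a bare `match` means a malformed line is skipped quietly. If you would rather hear about bad lines, match explicitly and log in the default case.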
Just for fun and completeness:
scala> val text = "Now is the time\nfor all good men\nto come home for dinner."
text: String =
Now is the time
for all good men
to come home for dinner.
scala> val r = """(?m)^(\S+)\s*(.*)$""".r
r: scala.util.matching.UnanchoredRegex = (?m)^(\S+)\s*(.*)$
scala> r findAllMatchIn text map (_ group 1) toList
warning: there was one feature warning; re-run with -feature for details
res0: List[String] = List(Now, for, to)
scala> r findAllMatchIn text map { case r(first, rest) => s"$first! ($rest)" } toList
warning: there was one feature warning; re-run with -feature for details
res1: List[String] = List(Now! (is the time), for! (all good men), to! (come home for dinner.))
Actually, that was to remind myself what the inline flag is. It is m, for multiline.
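To see what the flag changes, here is a small sketch that runs the same first-word pattern with and without `(?m)` on the text from the session above. The names `withFlag` and `withoutFlag` are only for illustration.

```scala
val text = "Now is the time\nfor all good men\nto come home for dinner."

// With (?m), ^ matches at the start of every line.
val withFlag = """(?m)^(\S+)""".r
// Without it, ^ matches only at the very start of the input.
val withoutFlag = """^(\S+)""".r

val everyLine = withFlag.findAllMatchIn(text).map(_.group(1)).toList
val firstOnly = withoutFlag.findAllMatchIn(text).map(_.group(1)).toList
```

Here `everyLine` holds the first word of each of the three lines, while `firstOnly` holds just the first word of the whole string. That is why a `^...$` pattern run over glued-together input needs the flag, and why matching one line at a time does not.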