Extract groups matched by a regex into an array in Scala

Date: 2017-05-11 08:54:55

Tags: arrays regex scala extract regex-group

I am stuck on this problem. I have a

val line:String = "PE018201804527901"

that matches this

regex : (.{2})(.{4})(.{9})(.{2})

I need to extract each group from the regex into an array.

The result would be:

Array("PE", "0182", "018045279", "01")

I tried this regex:

val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val x= regex.findAllIn(line).toArray

But it does not work: findAllIn returns the whole matched strings, not the individual groups.

3 answers:

Answer 0 (score: 3)

Your solution, @sheunis, was very helpful. In the end I solved it with this method:

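import scala.collection.mutable.ListBuffer
import scala.util.matching.Regex

// Collects every capture group of every match into a single array.
def extractFromRegex(regex: Regex, line: String): Array[String] = {
  val list = ListBuffer[String]()
  for {
    m <- regex.findAllIn(line).matchData
    e <- m.subgroups
  } list += e
  list.toArray
}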

I did it this way because your solution, used in this code:

val line:String = """PE0182"""
val regex ="""(.{2})(.{4})""".r  
val t = regex.findAllIn(line).subgroups.toArray

throws the following exception:

Exception in thread "main" java.lang.IllegalStateException: No match available
at java.util.regex.Matcher.start(Matcher.java:372)
at scala.util.matching.Regex$MatchIterator.start(Regex.scala:696)
at scala.util.matching.Regex$MatchData$class.group(Regex.scala:549)
at scala.util.matching.Regex$MatchIterator.group(Regex.scala:671)
at scala.util.matching.Regex$MatchData$$anonfun$subgroups$1.apply(Regex.scala:553)
at scala.util.matching.Regex$MatchData$$anonfun$subgroups$1.apply(Regex.scala:553)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at scala.util.matching.Regex$MatchData$class.subgroups(Regex.scala:553)
at scala.util.matching.Regex$MatchIterator.subgroups(Regex.scala:671)
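
The helper above can probably also be written without a mutable buffer. The following is a minimal sketch, assuming `findAllMatchIn` is used to iterate over `Regex.Match` objects; the name `extractGroups` is hypothetical and not part of the original answer.

```scala
import scala.util.matching.Regex

// Hypothetical helper: flattens the capture groups of every match into one array.
def extractGroups(regex: Regex, line: String): Array[String] =
  regex.findAllMatchIn(line).flatMap(_.subgroups).toArray

val groups = extractGroups("""(.{2})(.{4})(.{9})(.{2})""".r, "PE018201804527901")
```

Each `Regex.Match` carries its own match data, so no group is read from an iterator that has not been advanced yet.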

Answer 1 (score: 2)

regex.findAllIn(line).subgroups.toArray
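
This one-liner reads the groups from the iterator before it has been advanced to a match, which appears to be the situation behind the `IllegalStateException: No match available` reported in the answer above. A sketch that does not depend on iterator state, assuming only the first match is wanted, could use `findFirstMatchIn`, which returns an `Option[Regex.Match]`:

```scala
val line = "PE018201804527901"
val regex = """(.{2})(.{4})(.{9})(.{2})""".r

// None (no match) is turned into an empty array instead of throwing.
val firstGroups: Array[String] =
  regex.findFirstMatchIn(line).map(_.subgroups.toArray).getOrElse(Array.empty[String])
```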

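Answer 2 (score: 2)

Note that findAllIn does not automatically anchor the regex pattern, so it will also find matches inside longer strings. If you only want to allow matches on strings of exactly 17 characters, you can use a match block, like this: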

val line = "PE018201804527901"
val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val results = line match {
  case regex(g1, g2, g3, g4) => Array(g1, g2, g3, g4)
  case _ => Array[String]()
}
// Demo printing
results.foreach { m =>
  println(m)
} 
// PE
// 0182
// 018045279
// 01

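See the Scala demo.

It also handles the no-match case gracefully by returning an empty string array.

If you need all matches and all of their groups, collect the groups of each match into a list and then append that list to a list buffer (scala.collection.mutable.ListBuffer):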
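
To illustrate the anchoring difference described above, here is a small sketch; the 20-character input `longer` is a made-up example, not taken from the question.

```scala
val regex = """(.{2})(.{4})(.{9})(.{2})""".r
val longer = "PE018201804527901XYZ" // hypothetical input, 20 characters

// A Regex used as a pattern must match the whole string, so this falls through to the default case.
val anchored = longer match {
  case regex(g1, g2, g3, g4) => Array(g1, g2, g3, g4)
  case _ => Array.empty[String]
}

// findFirstIn is unanchored and still matches the first 17 characters.
val unanchored = regex.findFirstIn(longer)
```

Here `anchored` ends up empty, while `unanchored` holds the 17-character prefix.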

import scala.collection.mutable.ListBuffer

val line = "PE018201804527901%E018201804527901"
val regex = """(.{2})(.{4})(.{9})(.{2})""".r
val results = ListBuffer[List[String]]()

val mi = regex.findAllIn(line)
while (mi.hasNext) {
  mi.next() // advance to the next match; the groups are then read from the iterator
  results += List(mi.group(1), mi.group(2), mi.group(3), mi.group(4))
}
// Demo printing
results.foreach { m =>
  println("------")
  println(m)
  m.foreach { l => println(l) }
}

Result:

------
List(PE, 0182, 018045279, 01)
PE
0182
018045279
01
------
List(%E, 0182, 018045279, 01)
%E
0182
018045279
01
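
The same list of lists can likely be built without a mutable buffer or an explicit loop. This is a sketch under the assumption that `findAllMatchIn` and `subgroups` are acceptable substitutes for the `group(n)` calls above; the name `allGroups` is introduced here for illustration only.

```scala
val line = "PE018201804527901%E018201804527901"
val regex = """(.{2})(.{4})(.{9})(.{2})""".r

// Each Regex.Match exposes its capture groups as a List[String] via subgroups.
val allGroups: List[List[String]] =
  regex.findAllMatchIn(line).map(_.subgroups).toList
```

This version does not hard-code the number of groups, so it keeps working if the pattern gains or loses a group.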

See this Scala demo.