Why, in Scala, does my Iterator[Match] give a length but no data?

Time: 2014-09-23 20:19:41

Tags: scala

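I'm trying to match the bracketed text below using regex.findAllMatchIn and an Iterator[Match]. The code below shows that in some cases matchesOne has a non-zero length, yet when I iterate over it afterwards it behaves like an empty iterator. I feel I'm missing something basic here. Any ideas?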

  import scala.util.matching.Regex.Match
  import scala.xml._

  val xmldata = <document>
    <content>
      <headers>
      </headers>
      <body>
        Foo [1], then another foo[2]; then lots of other things here
        And add a few other lines[2][3] of test data[3][5] (Foo 1234)
      </body>
    </content>
   </document>

  val bodyIterator : Iterator[String]= ((xmldata \ "content" \ "body").text).linesWithSeparators

  while (bodyIterator.hasNext) {
    val line = bodyIterator.next()

    println(s"*****   Line is: $line")

    val citationOne = """(\[[0-9]+\])(,\[[0-9]+\])*""".r
    val citationTwo = """(\([A-Z, -.]+[0-9]{4}\))""".r
    /* search the line for citations */

    val matchesOne: Iterator[Match] = citationOne.findAllMatchIn(line)
    val matchesTwo: Iterator[Match] = citationTwo.findAllMatchIn(line)

    println("matchesOne found: " + matchesOne.length)
    println("matchesTwo found: " + matchesTwo.length)
    for (m <- matchesOne) {println(s"match is $m")}

    println("matchesOne Matches: ")
    matchesOne.foreach(x => println("1: " + x.matched))
    //while (matchesOne.hasNext) {
    // println("matchesOne: " + matchesOne.next())
    // }

    while (matchesTwo.hasNext) {
      println("matchesTwo: " + matchesTwo.next().matched)
    }

    println("\n\n")
  }

Output:

  import scala.util.matching.Regex.Match
  import scala.xml._

  xmldata: scala.xml.Elem = <document>
    <content>
      <headers>
      </headers>
      <body>
        Foo [1], then another foo[2]; then lots of other things here
        And add a few other lines[2][3] of test data[3][5] (Foo 1234)
      </body>
      </content>
     </document>

  bodyIterator: Iterator[String] = non-empty iterator

  *****   Line is: 

  matchesOne found: 0
  matchesTwo found: 0
  matchesOne Matches: 



  *****   Line is:       Foo [1], then another foo[2]; then lots of other things here

  matchesOne found: 2
  matchesTwo found: 0
  matchesOne Matches: 



  *****   Line is:       And add a few other lines[2][3] of test data[3][5] (Foo 1234)

  matchesOne found: 4
  matchesTwo found: 0
  matchesOne Matches: 



  *****   Line is:     
  matchesOne found: 0
  matchesTwo found: 0 

3 Answers:

Answer 0 (score: 5)

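Calling length on an Iterator exhausts it, as stated in the documentation for Iterator.length:

Note - Reuse: After calling this method, one should discard the iterator it was called on.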
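As a quick illustration of that note, here is a minimal sketch using a simplified version of the question's first pattern. The names citation, count and leftOver are made up for this example; only findAllMatchIn, length and hasNext come from the standard library.

```scala
import scala.util.matching.Regex

// Simplified citation pattern, assumed here only for illustration.
val citation: Regex = """\[[0-9]+\]""".r
val line = "Foo [1], then another foo[2]"

val matches = citation.findAllMatchIn(line) // Iterator[Regex.Match]
val count = matches.length                  // walks every match in order to count them
val leftOver = matches.hasNext              // checked after counting: nothing remains

println(s"count = $count, hasNext afterwards = $leftOver")
```

With the standard library's regex iterator this should print a count of 2 and hasNext as false, which is the same symptom as in the question.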

Answer 1 (score: 3)

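Computing the length of an iterator consumes it, because it has to step through every element to find out how many there are. So once you know the length, the iterator is empty!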
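To make the mechanism concrete, here is a rough sketch of counting by hand. countByWalking is a hypothetical helper written for this example, not the library's implementation; it only shows that the one way to count an iterator's elements is to pull them out one by one.

```scala
// Hypothetical helper: counts elements the only way an iterator allows,
// by calling next() until hasNext reports false.
def countByWalking[A](it: Iterator[A]): Int = {
  var n = 0
  while (it.hasNext) {
    it.next() // the element is consumed and discarded here
    n += 1
  }
  n
}

val letters = Iterator("a", "b", "c")
val n = countByWalking(letters)
val remaining = letters.toList // collected after the walk, so it is empty

println(s"counted $n, remaining = $remaining")
```

After the call, letters has nothing left to hand out, so remaining is an empty list.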

Answer 2 (score: 1)

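By the time you have the iterator's length, you are already at its end, so you can't get any data out of it afterwards. In your case, the solution is to convert it to a List.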

   val matchesOne: List[Match] = citationOne.findAllMatchIn(line).toList
   val matchesTwo: List[Match] = citationTwo.findAllMatchIn(line).toList

Then you get the expected output, for example:

   scala> val line = "Foo [1], then another foo[2]; then lots of other things here"
   line: String = Foo [1], then another foo[2]; then lots of other things here

   scala> val result = citationOne.findAllMatchIn(line).toList
   result: List[scala.util.matching.Regex.Match] = List([1], [2])

   scala> val matchesOne = citationOne.findAllMatchIn(line).toList
   matchesOne: List[scala.util.matching.Regex.Match] = List([1], [2])

   scala> println("matchesOne found: " + matchesOne.length)
   matchesOne found: 2

   scala> for (m <- matchesOne) {println(s"match is $m")}
   match is [1]
   match is [2]
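For anyone who wants to paste the fix into a worksheet in one go, here is a self-contained sketch. It reuses the citationOne pattern and the second sample line from the question; found and texts are names introduced just for this example.

```scala
import scala.util.matching.Regex
import scala.util.matching.Regex.Match

val citationOne: Regex = """(\[[0-9]+\])(,\[[0-9]+\])*""".r
val line = "And add a few other lines[2][3] of test data[3][5] (Foo 1234)"

// Materialise the matches once; unlike an Iterator, a List can be traversed repeatedly.
val matchesOne: List[Match] = citationOne.findAllMatchIn(line).toList

val found = matchesOne.length         // asking a List for its length consumes nothing
val texts = matchesOne.map(_.matched) // all matches are still available

println(s"matchesOne found: $found")
texts.foreach(t => println(s"1: $t"))
```

The trade-off is that toList holds every match in memory at once, which is fine for a single line of text like this.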