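I am trying to match the bracketed text below using regex.findAllMatchIn and an Iterator[Match]. The code below shows that in some cases matchesOne has a non-zero length, but afterwards it behaves like an empty iterator. I feel I am missing something basic here. Any ideas?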
import scala.util.matching.Regex.Match
import scala.xml._
val xmldata = <document>
<content>
<headers>
</headers>
<body>
Foo [1], then another foo[2]; then lots of other things here
And add a few other lines[2][3] of test data[3][5] (Foo 1234)
</body>
</content>
</document>
val bodyIterator : Iterator[String]= ((xmldata \ "content" \ "body").text).linesWithSeparators
while (bodyIterator.hasNext) {
  val line = bodyIterator.next()
  println(s"***** Line is: $line")
  val citationOne = """(\[[0-9]+\])(,\[[0-9]+\])*""".r
  val citationTwo = """(\([A-Z, -.]+[0-9]{4}\))""".r
  /* search the line for citations */
  val matchesOne: Iterator[Match] = citationOne.findAllMatchIn(line)
  val matchesTwo: Iterator[Match] = citationTwo.findAllMatchIn(line)
  println("matchesOne found: " + matchesOne.length)
  println("matchesTwo found: " + matchesTwo.length)
  for (m <- matchesOne) {println(s"match is $m")}
  println("matchesOne Matches: ")
  matchesOne.foreach(x => println("1: " + x.matched))
  //while (matchesOne.hasNext) {
  //  println("matchesOne: " + matchesOne.next())
  //}
  while (matchesTwo.hasNext) {
    println("matchesTwo: " + matchesTwo.next().matched)
  }
  println("\n\n")
}
Output:
import scala.util.matching.Regex.Match
import scala.xml._
xmldata: scala.xml.Elem = <document>
<content>
<headers>
</headers>
<body>
Foo [1], then another foo[2]; then lots of other things here
And add a few other lines[2][3] of test data[3][5] (Foo 1234)
</body>
</content>
</document>
bodyIterator: Iterator[String] = non-empty iterator
***** Line is:
matchesOne found: 0
matchesTwo found: 0
matchesOne Matches:
***** Line is: Foo [1], then another foo[2]; then lots of other things here
matchesOne found: 2
matchesTwo found: 0
matchesOne Matches:
***** Line is: And add a few other lines[2][3] of test data[3][5] (Foo 1234)
matchesOne found: 4
matchesTwo found: 0
matchesOne Matches:
***** Line is:
matchesOne found: 0
matchesTwo found: 0
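Answer 0 (score: 5)
As stated in the documentation for Iterator.length, calling length on an Iterator exhausts it:
Note - Reuse: After calling this method, one should discard the iterator it was called on.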
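Answer 1 (score: 3)
Computing the length of an iterator consumes it, because it has to step through every element to count them. So once you know the length, the iterator is empty!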
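To make this concrete, here is a minimal sketch that can be pasted into the REPL or a worksheet. The simplified pattern and the names `citation`, `sample`, `count` and `leftover` are made up for illustration and are not from the question. Strictly speaking the documentation leaves reuse after `length` undefined; the sketch only shows what typically happens in practice, which matches the output in the question.

```scala
// Simplified pattern for illustration: one bracketed number such as [1]
val citation = """\[[0-9]+\]""".r
val sample = "Foo [1], then another foo[2]"

val matches = citation.findAllMatchIn(sample)

// length has to walk through every match in order to count them
val count = matches.length

// after that walk there is nothing left to iterate over
val leftover = matches.hasNext

println(s"count = $count, hasNext afterwards = $leftover")
```

This is why the `for` loop and the `foreach` in the question print nothing: both run after `length` has already walked the iterator to its end.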
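Answer 2 (score: 1)
By the time you have the length of the iterator, you are already at its end, so you cannot get any more data out of it. In your case, the solution is to convert it to a List.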
val matchesOne: List[Match] = citationOne.findAllMatchIn(line).toList
val matchesTwo: List[Match] = citationTwo.findAllMatchIn(line).toList
Then you get the expected output, for example:
scala> val line = "Foo [1], then another foo[2]; then lots of other things here"
line: String = Foo [1], then another foo[2]; then lots of other things here
scala> val result = citationOne.findAllMatchIn(line).toList
result: List[scala.util.matching.Regex.Match] = List([1], [2])
scala> val matchesOne = citationOne.findAllMatchIn(line).toList
matchesOne: List[scala.util.matching.Regex.Match] = List([1], [2])
scala> println("matchesOne found: " + matchesOne.length)
matchesOne found: 2
scala> for (m <- matchesOne) {println(s"match is $m")}
match is [1]
match is [2]
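Putting it together, the loop from the question could be restructured roughly as follows. This is only a sketch: the helper `citationsIn` and the hard-coded `lines` list are assumptions made for illustration, standing in for the lines read from the XML body.

```scala
// Same first pattern as in the question
val citationOne = """(\[[0-9]+\])(,\[[0-9]+\])*""".r

// Hypothetical helper: materialize the matches once into a List,
// so they can be both counted and traversed as often as needed
def citationsIn(line: String): List[String] =
  citationOne.findAllMatchIn(line).map(_.matched).toList

// Stand-in for the lines of the <body> element
val lines = List(
  "Foo [1], then another foo[2]; then lots of other things here",
  "And add a few other lines[2][3] of test data[3][5] (Foo 1234)"
)

for (line <- lines) {
  val found = citationsIn(line)
  // length on a List does not consume anything
  println(s"matchesOne found: ${found.length}")
  found.foreach(m => println(s"match is $m"))
}
```

Since a List is a strict collection, calling `length` first and iterating afterwards is safe. The cost is that all matches for a line are held in memory at once, which is negligible for short lines like these.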