Lazily convert a String of words into a Stream of words

Time: 2014-07-29 15:10:20

Tags: scala

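Given a string of words separated by spaces, such as "aaa bbb ccc ddd", can you lazily convert it into a Stream of the whitespace-separated substrings, e.g. Stream("aaa", ?)? Is it necessary to create an iterator first?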

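2 answers:

Answer 0: (score: 2)

This may be a hacky solution, but it might suit your needs. You can take advantage of the fact that java.util.Scanner is also an iterator (albeit a Java iterator).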

import java.util.Scanner
import scala.collection.JavaConverters._

val str = "aaa bbb ccc ddd"
val tokenizer = new Scanner(str).useDelimiter(" ")
// JavaConverters offers no implicit conversion, so the Java Iterator
// is turned into a Scala one explicitly with asScala
val it: Iterator[String] = tokenizer.asScala
// it: Iterator[String] = non-empty iterator
val stream = it.toStream
// stream: scala.collection.immutable.Stream[String] = Stream(aaa, ?)

The code above can be written as almost a one-liner:

import scala.collection.JavaConverters._
import java.util.Scanner

val stream = new Scanner(str).useDelimiter(" ").asScala.toStream
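
Note that useDelimiter(" ") splits on single spaces only, so two consecutive spaces produce an empty token. Below is a minimal sketch, assuming that any run of whitespace should separate words. It relies on Scanner's default delimiter, which already matches whitespace, so the useDelimiter call is dropped. The names LazyWordsSketch and lazyWords are made up for illustration and do not come from the answer.

```scala
import java.util.Scanner
import scala.collection.JavaConverters._

object LazyWordsSketch {
  // Hypothetical helper, not part of the answer above.
  // Scanner's default delimiter matches runs of whitespace,
  // so spaces, tabs and repeated spaces all separate words.
  def lazyWords(str: String): Stream[String] =
    // toStream extracts the first word right away; the remaining
    // words are scanned only when the stream is forced further.
    new Scanner(str).asScala.toStream
}
```

For example, LazyWordsSketch.lazyWords("aaa  bbb\tccc ddd").take(2).toList should give List("aaa", "bbb") without the last two words ever being requested from the stream.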

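Answer 1: (score: 0)

Here is a solution that should work, written after looking at the scaladocs. To mimic String's split(regex: String, limit: Int) function, edge cases such as 1 and 0 would need to be handled separately.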

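  def wordStream(s: String): Stream[String] = {
    def loop(offset: Int): Stream[String] = {
      val substring = s.substring(offset)
      // number of whitespace characters in front of the next word
      val preWhitespace = substring.takeWhile(_.isWhitespace).length
      val word = substring.drop(preWhitespace).takeWhile(c => !c.isWhitespace)

      // no word left: empty input or only trailing whitespace
      if (word.isEmpty) Stream.empty
      else word #:: {
        // the tail of #:: is evaluated lazily, so the next word is
        // extracted only when the stream is forced further
        val newOffset = offset + preWhitespace + word.length
        if (newOffset >= s.length) Stream.empty
        else loop(newOffset)
      }
    }
    loop(0)
  }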
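
The same recursive idea can also be sketched with index arithmetic instead of taking a fresh substring at every step. This is only an illustrative variant, assuming that words are maximal runs of non-whitespace characters; the names WordStreamSketch and wordStreamByIndex are hypothetical and do not appear in the answer.

```scala
object WordStreamSketch {
  // Hypothetical variant: walks the string by index and only
  // allocates the substring for each word that is actually produced.
  def wordStreamByIndex(s: String): Stream[String] = {
    def loop(offset: Int): Stream[String] = {
      // Index of the first non-whitespace character at or after offset, or -1.
      val start = s.indexWhere(c => !c.isWhitespace, offset)
      if (start < 0) Stream.empty // nothing but whitespace is left
      else {
        // Index of the first whitespace character after the word, or -1.
        val found = s.indexWhere(_.isWhitespace, start)
        val end = if (found < 0) s.length else found
        // The right-hand side of #:: is by-name, so loop(end) runs on demand.
        s.substring(start, end) #:: loop(end)
      }
    }
    loop(0)
  }
}
```

With this sketch, leading, trailing and repeated whitespace produce no empty words, and an empty input yields an empty stream.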