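I wrote a small Google rank checker for websites in Scala 2.12.x. It uses page scraping to find a website's rank for a given search term. I wanted to build it with Scala's Stream, and the code below is a mock-up of its control structure. However, I can't find a way to rewrite it without side effects, in other words, without using any var.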
def main(args: Array[String]): Unit = {
  val target = 22 // normally this would be the website domain name
  val inf = 100 // we don't care for ranks above this value
  var result: Option[Int] = None // <============= Side effects! how to rewrite it?
  Stream.iterate(0)(_ + 10).takeWhile { i =>
    // assume I'm page-scraping Google with 10 results per page
    // and need to find the rank or position where the target
    // website appears
    for (j <- i until (i + 10)) {
      // check whether the website was found
      if (j == target) {
        result = Some(j) // <============= Side effects! how to rewrite it?
      }
    }
    result.isEmpty && i < inf
  }.toList
  println(result.getOrElse(inf))
}
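Basically, I would like the Stream statement to return result directly, that is, the position or rank at which the target website appears. I can't iterate one result at a time, because the code fetches each page of 10 results at once, scrapes it, and searches for the target website within each group of 10 results.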
Answer 0 (score: 3)
You can split the pipeline into map and dropWhile (which replaces takeWhile):
val target = 22 // normally this would be the website domain name
val inf = 100 // we don't care for ranks above this value
val result = Stream.iterate(0)(_ + 10).map { i =>
  // or maybe just use find?
  // scan the same page window as the question: i until (i + 10)
  val r = Stream.range(i, i + 10).dropWhile(_ != target).headOption
  (r, i) // pass the result along with the page offset for dropWhile
}.dropWhile {
  case (r, i) => r.isEmpty && i < inf // keep dropping pages while nothing is found and we are below the limit
}.map(_._1) // keep only the result
  .head // Option[Int]: None if the target was not found before reaching inf
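Following up on the "maybe just use find?" comment in the snippet above, here is a minimal sketch of that variant. It assumes the scraping step can be modelled as a function returning the ranks on one page; fetchPage and the RankSearch object are hypothetical names used only for illustration, and the page contents are faked the same way as in the question.

```scala
object RankSearch {
  // Hypothetical stand-in for scraping one results page: returns the
  // ranks shown on the page that starts at offset `start`.
  def fetchPage(start: Int): Seq[Int] = start until (start + 10)

  // Returns the rank of `target`, or None if it is not found below `inf`.
  def findRank(target: Int, inf: Int): Option[Int] =
    Stream.iterate(0)(_ + 10) // page offsets 0, 10, 20, ...
      .takeWhile(_ < inf)     // stop requesting pages at the cut-off
      .flatMap(fetchPage)     // lazily flatten pages into a stream of ranks
      .find(_ == target)      // stops at the first match

  def main(args: Array[String]): Unit =
    println(findRank(22, 100).getOrElse(100)) // prints 22
}
```

Because Stream is lazy, flatMap only calls fetchPage for a page once find has exhausted the previous one, so no page after the one containing the match is requested. Note that this version stops at offsets below inf, whereas the original loop also scans the page starting at inf.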
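You should also be aware that I only got rid of the assignment to a mutable variable; your code still has side effects, because you are making network calls.
You should consider using Future or some kind of IO monad to handle those calls.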
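As a rough illustration of the Future suggestion, the page-by-page search could be written as an asynchronous recursion. This is only a sketch under assumptions: fetchPage and AsyncRankSearch are hypothetical names, the page contents are faked as in the question, and a real version would perform an HTTP request inside the Future.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object AsyncRankSearch {
  // Hypothetical asynchronous page fetch; a real implementation would
  // issue the network call here instead of generating a range.
  def fetchPage(start: Int)(implicit ec: ExecutionContext): Future[Seq[Int]] =
    Future(start until (start + 10))

  // Fetches pages one after another until the target is found or the
  // offset reaches `inf`. No mutable state is needed.
  def findRank(target: Int, inf: Int, start: Int = 0)(
      implicit ec: ExecutionContext): Future[Option[Int]] =
    if (start >= inf) Future.successful(None)
    else fetchPage(start).flatMap { page =>
      page.find(_ == target) match {
        case None  => findRank(target, inf, start + 10) // try the next page
        case found => Future.successful(found)          // stop at the first match
      }
    }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    // Blocking is done only at the edge of the program.
    val rank = Await.result(findRank(22, 100), 10.seconds)
    println(rank.getOrElse(100)) // prints 22
  }
}
```

The next page is requested only inside the flatMap callback of the previous one, so the requests stay sequential and stop as soon as a match is found.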