How to rewrite a for loop with a shared dependency using actors

Date: 2010-03-22 19:05:17

Tags: scala actor akka

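We have some code that needs to run faster. It has already been profiled, so we would like to make use of multiple threads. Usually I would set up an in-memory queue and have a number of threads take jobs off the queue and calculate the results. For the shared data I would use a ConcurrentHashMap or similar.

I don't really want to go down that route again. From what I have read, using actors results in cleaner code, and if I use akka, migrating to more than one JVM should be easier. Is that true?

However, I don't know how to think in actors, so I am not sure where to start.

To give a better idea of the problem, here is some sample code: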

case class Trade(price:Double, volume:Int, stock:String) {
  def value(priceCalculator:PriceCalculator) =
    (priceCalculator.priceFor(stock) - price) * volume
}
class PriceCalculator {
  def priceFor(stock:String) = {
    Thread.sleep(20)//a slow operation which can be cached
    50.0
  }
}
object ValueTrades {

  def valueAll(trades:List[Trade],
      priceCalculator:PriceCalculator):List[(Trade,Double)] = {
    trades.map { trade => (trade,trade.value(priceCalculator)) }
  }

  def main(args:Array[String]) {
    val trades = List(
      Trade(30.5, 10, "Foo"),
      Trade(30.5, 20, "Foo")
      //usually much longer
    )
    val priceCalculator = new PriceCalculator
    val values = valueAll(trades, priceCalculator)
  }

}

If anyone with experience using actors could suggest how this would map onto actors, I would be grateful.

2 answers:

Answer 0 (score: 3)

This complements my comment about sharing the results of expensive computations. Here it is:

import scala.actors._
import Actor._
import Futures._

case class PriceFor(stock: String) // Ask for result

// The following could be an "object" as well, if it's supposed to be singleton
class PriceCalculator extends Actor {
  val map = new scala.collection.mutable.HashMap[String, Future[Double]]()
  def act = loop {
    react {
      case PriceFor(stock) => reply(map getOrElseUpdate (stock, future {
        Thread.sleep(2000) // a slow operation
        50.0
      }))
    }
  }
}

Here is a usage example:

scala> val pc = new PriceCalculator; pc.start
pc: PriceCalculator = PriceCalculator@141fe06

scala> class Test(stock: String) extends Actor {
     |   def act = {
     |     println(System.currentTimeMillis().toString+": Asking for stock "+stock)
     |     val f = (pc !? PriceFor(stock)).asInstanceOf[Future[Double]]
     |     println(System.currentTimeMillis().toString+": Got the future back")
     |     val res = f.apply() // this blocks until the result is ready
     |     println(System.currentTimeMillis().toString+": Value: "+res)
     |   }
     | }
defined class Test

scala> List("abc", "def", "abc").map(new Test(_)).map(_.start)
1269310737461: Asking for stock abc
res37: List[scala.actors.Actor] = List(Test@6d888e, Test@1203c7f, Test@163d118)
1269310737461: Asking for stock abc
1269310737461: Asking for stock def
1269310737464: Got the future back

scala> 1269310737462: Got the future back
1269310737465: Got the future back
1269310739462: Value: 50.0
1269310739462: Value: 50.0
1269310739465: Value: 50.0


scala> new Test("abc").start // Should return instantly
1269310755364: Asking for stock abc
res38: scala.actors.Actor = Test@15b5b68
1269310755365: Got the future back

scala> 1269310755367: Value: 50.0
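
The scala.actors library used above has since been deprecated, so as a hedged alternative, here is a minimal sketch of the same idea (cache the Future, not the value) using scala.concurrent from the standard library. This is not the answer's own code: the names CachedPrices, slowLookup and lookups are hypothetical, and a ConcurrentHashMap stands in for the actor that serialized access to the map.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical names throughout; this is an illustration, not the answer's API.
object CachedPrices {
  import ExecutionContext.Implicits.global

  // Counts how many times the slow operation actually runs.
  val lookups = new AtomicInteger(0)

  // One Future per stock; callers share it whether or not it has completed.
  private val cache = new ConcurrentHashMap[String, Future[Double]]()

  // Stand-in for the slow operation from the question.
  private def slowLookup(stock: String): Double = {
    lookups.incrementAndGet()
    Thread.sleep(20)
    50.0
  }

  // Returns immediately. computeIfAbsent runs the mapping function at most
  // once per key, so only the first caller for a stock starts a computation.
  def priceFor(stock: String): Future[Double] =
    cache.computeIfAbsent(stock, new java.util.function.Function[String, Future[Double]] {
      def apply(s: String): Future[Double] = Future(slowLookup(s))
    })
}
```

Asking for "abc", "def" and "abc" again should start only two slow lookups, with the third request reusing the Future created for the first. Callers can block with Await.result(f, 5.seconds) or, preferably, compose the futures with map and flatMap.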

Answer 1 (score: 2)

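For simple parallelization, where I throw out a bunch of work to be processed and then wait for all of it to come back, I tend to use the Futures pattern.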

class ActorExample {
  import actors._
  import Actor._
  class Worker(val id: Int) extends Actor {
    def busywork(i0: Int, i1: Int) = {
      var sum,i = i0
      while (i < i1) {
        i += 1
        sum += 42*i
      }
      sum
    }
    def act() { loop { react {
      case (i0:Int,i1:Int) => sender ! busywork(i0,i1)
      case None => exit()
    }}}
  }

  val workforce = (1 to 4).map(i => new Worker(i)).toList

  def parallelFourSums = {
    workforce.foreach(_.start())
    val futures = workforce.map(w => w !! ((w.id,1000000000)) );
    val computed = futures.map(f => f() match {
      case i:Int => i
      case _ => throw new IllegalArgumentException("I wanted an int!")
    })
    workforce.foreach(_ ! None)
    computed
  }

  def serialFourSums = {
    val solo = workforce.head
    workforce.map(w => solo.busywork(w.id,1000000000))
  }

  def timed(f: => List[Int]) = {
    val t0 = System.nanoTime
    val result = f
    val t1 = System.nanoTime
    (result, t1-t0)
  }

  def go {
    val serial = timed( serialFourSums )
    val parallel = timed( parallelFourSums )
    println("Serial result:  " + serial._1)
    println("Parallel result:" + parallel._1)
    printf("Serial took   %.3f seconds\n",serial._2*1e-9)
    printf("Parallel took %.3f seconds\n",parallel._2*1e-9)
  }
}

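Basically, the idea is to create a collection of workers, one per workload, and then throw all the data at them with !!, which immediately gives back a future. When you try to read the future, the sender blocks until the worker has actually finished with the data.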

You could rewrite the above so that PriceCalculator extends Actor instead, and valueAll coordinates the return of the data.
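
As a rough illustration of valueAll coordinating the results, here is a minimal sketch that swaps actors for scala.concurrent futures from the standard library (scala.actors has since been deprecated). It is an assumption about how the rewrite could look, not the answerer's code: the object name ParallelValueTrades is hypothetical, and the price lookup is left uncached to keep the example short.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical wrapper object; Trade and PriceCalculator mirror the question.
object ParallelValueTrades {
  import ExecutionContext.Implicits.global

  // Immutable message type, safe to hand to other threads.
  case class Trade(price: Double, volume: Int, stock: String)

  // Stand-in for the slow price lookup from the question (no caching here).
  class PriceCalculator {
    def priceFor(stock: String): Double = {
      Thread.sleep(20)
      50.0
    }
  }

  // Starts one Future per trade and combines them into a single Future whose
  // result list is in the same order as the input list.
  def valueAll(trades: List[Trade],
               pc: PriceCalculator): Future[List[(Trade, Double)]] =
    Future.traverse(trades) { trade =>
      Future((trade, (pc.priceFor(trade.stock) - trade.price) * trade.volume))
    }

  def main(args: Array[String]): Unit = {
    val trades = List(Trade(30.5, 10, "Foo"), Trade(30.5, 20, "Foo"))
    // Block only at the very edge of the program.
    val values = Await.result(valueAll(trades, new PriceCalculator), 10.seconds)
    values.foreach(println)
  }
}
```

The caller gets a single future back and decides where to block, if at all. With a long list of trades, the blocking Thread.sleep stand-in would tie up pool threads, so a real lookup would probably want a dedicated execution context or a cache of futures.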

Note that you have to be careful to pass only immutable data around.

Anyway, on the machine I'm typing this on, if you run the above you get:

scala> (new ActorExample).go
Serial result:  List(-1629056553, -1629056636, -1629056761, -1629056928)
Parallel result:List(-1629056553, -1629056636, -1629056761, -1629056928)
Serial took   1.532 seconds
Parallel took 0.443 seconds

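(Obviously I have at least four cores; the parallel timing varies depending on which worker gets which processor and on what else is going on on the machine.)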