Scala: run Future jobs sequentially

Asked: 2018-02-21 11:42:24

Tags: scala future

I am trying to start three jobs sequentially, but when I use this code:

val jobs = Seq("stream.Job1", "stream.Job2", "stream.Job3")
Future.sequence {
  jobs.map { jobClass =>
    Future {
      println(s"Starting the spark job from class $jobClass...")
      %gcloud("dataproc", "jobs", "submit", "spark", s"--cluster=$clusterName", s"--class=$jobClass", "--region=global", s"--jars=$JarFile")
      println(s"Starting the spark job from class $jobClass...DONE")
    }
  }
}

the three jobs run in parallel rather than sequentially. I think the solution involves flatMap, but I could not get it to work. Please help.

2 answers:

Answer 0 (score: 2)

Try this:

import scala.util.control.NonFatal

val jobs = Seq("stream.Job1", "stream.Job2", "stream.Job3")
jobs.foldLeft(Future.successful(())) { (result, jobClass) =>
  result.flatMap { _ =>
    Future {
      println(s"Starting the spark job from class $jobClass...")
      %gcloud("dataproc", "jobs", "submit", "spark", s"--cluster=$clusterName", s"--class=$jobClass", "--region=global", s"--jars=$JarFile")
      println(s"Starting the spark job from class $jobClass...DONE")
    }
  }.recoverWith {
    case NonFatal(e) => result
  }
}

This iterates over your jobs, running each one as soon as the previous one has finished. I added the recoverWith block so that the Futures are handled independently: if any of them fails, the chain still continues with the next job.
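To see that the fold really serializes the work, here is a self-contained sketch of the same pattern. The `submit` function is a hypothetical stand-in for the gcloud call (cluster details from the question are omitted); it simply records the order in which the jobs execute:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.control.NonFatal

// Hypothetical stand-in for the gcloud submission: records run order.
val order = scala.collection.mutable.ListBuffer.empty[String]
def submit(jobClass: String): Unit = order.synchronized { order += jobClass }

val jobs = Seq("stream.Job1", "stream.Job2", "stream.Job3")

// Each step is created inside flatMap, so it starts only after the
// previous Future has completed.
val chained: Future[Unit] = jobs.foldLeft(Future.successful(())) { (result, jobClass) =>
  result.flatMap { _ =>
    Future(submit(jobClass))
  }.recoverWith {
    case NonFatal(_) => result // a failed job does not break the chain
  }
}

Await.result(chained, 10.seconds)
// order now holds the jobs in submission order: Job1, Job2, Job3
```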

Answer 1 (score: 1)

If the jobs do not depend on each other and you want a list of the results at the end, you can use:

import scala.concurrent._
def runIndependentSequentially[X]
  (futs: List[() => Future[X]])
  (implicit ec: ExecutionContext): Future[List[X]] = futs match {
  case Nil => Future { Nil }
  case h :: t => for {
    x <- h()
    xs <- runIndependentSequentially(t)
  } yield x :: xs
}

Now you can use it on your list of job future factories like this:

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.language.postfixOps

val jobs = List("stream.Job1","stream.Job2","stream.Job3")
val futFactories = jobs.map { jobClass =>
  () => Future {
    println(s"Starting the spark job from class $jobClass...")
    Thread.sleep(5000)
    "result[" + jobClass + "," + (System.currentTimeMillis / 1000) % 3600 + "]"
  }
}

println(Await.result(runIndependentSequentially(futFactories), 30 seconds))

This produces the following output:

Starting the spark job from class stream.Job1...
Starting the spark job from class stream.Job2...
Starting the spark job from class stream.Job3...
List(result[stream.Job1,3011], result[stream.Job2,3016], result[stream.Job3,3021])

Update: replaced the list of futures with List[() => Future[X]] so that evaluation of the futures does not start before the argument is passed to the runIndependentSequentially method. Many thanks to @Evgeny for pointing it out!
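To illustrate why the thunks matter: a plain Future starts executing as soon as it is constructed, so a List[Future[X]] would already be running in parallel before runIndependentSequentially ever received it. A minimal sketch of the difference (the counter and names are illustrative):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import java.util.concurrent.atomic.AtomicInteger

val started = new AtomicInteger(0)

// Eager: the body runs as soon as the Future is constructed.
val eager: Future[Int] = Future { started.incrementAndGet() }
Await.result(eager, 5.seconds)
val afterEager = started.get // 1

// Deferred: wrapping in () => Future delays construction until invoked.
val thunk: () => Future[Int] = () => Future { started.incrementAndGet() }
val beforeInvoke = started.get // still 1 -- nothing has run yet

Await.result(thunk(), 5.seconds)
val afterInvoke = started.get // 2
```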