What is the difference between foreachAsync and foreach?

Asked: 2017-11-07 08:52:44

Tags: java apache-spark

Question

  1. Can anyone explain the difference between foreachAsync and foreach in Spark?
  2. Does foreachAsync run in parallel?
  3. My code sample in Java:

    rdds.foreach(x -> { /* some actions */ });      // works
    rdds.foreachAsync(x -> { /* some actions */ }); // fails
    

    Error log:

    17/11/07 16:42:38 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
    17/11/07 16:42:40 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
    17/11/07 16:42:43 WARN LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobStart(0,1510044163831,WrappedArray(org.apache.spark.scheduler.StageInfo@122f2c22, org.apache.spark.scheduler.StageInfo@c15154c),{spark.rdd.scope.noOverride=true, spark.rdd.scope={"id":"3","name":"foreachAsync"}})
    17/11/07 16:42:43 WARN LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageSubmitted(org.apache.spark.scheduler.StageInfo@4b276b68,{spark.rdd.scope.noOverride=true, spark.rdd.scope={"id":"3","name":"foreachAsync"}})
    17/11/07 16:42:43 WARN LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@4b276b68)
    17/11/07 16:42:43 WARN LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(0,1510044163921,JobFailed(org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
    org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:105)
    org.apache.spark.SparkContext.broadcast(SparkContext.scala:1347)
    org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:873)
    org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:774)
    org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:777)
    org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:776)
    scala.collection.immutable.List.foreach(List.scala:318)
    org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:776)
    org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:759)
    org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1508)
    org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1500)
    org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1487)
    org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:72)
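
Editor's note: in the Spark Java API, `foreach` is a blocking action, while `foreachAsync` submits the job and immediately returns a `JavaFutureAction<Void>`, which is a `java.util.concurrent.Future`. Both run their tasks in parallel across partitions; the only difference is whether the driver waits. The log above says "Cannot call methods on a stopped SparkContext", which is consistent with the context being stopped (or the driver exiting) before the asynchronous job had run. That is a reading of the log, not a confirmed diagnosis. The sketch below does not use Spark at all: `MiniContext` is a hypothetical stand-in built on `CompletableFuture`, meant only to show why the returned future should be waited on before the context is stopped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class ForeachAsyncSketch {

    /** Hypothetical stand-in for a SparkContext: jobs fail once it is stopped. */
    static final class MiniContext {
        private volatile boolean stopped = false;

        /** Blocking variant: returns only after the action ran on every element. */
        <T> void foreach(List<T> items, Consumer<T> action) {
            foreachAsync(items, action).join();
        }

        /** Non-blocking variant: hands the job to another thread and returns at once. */
        <T> CompletableFuture<Void> foreachAsync(List<T> items, Consumer<T> action) {
            return CompletableFuture.runAsync(() -> {
                for (T item : items) {
                    // Mirrors the "stopped SparkContext" check seen in the log.
                    if (stopped) {
                        throw new IllegalStateException("context already stopped");
                    }
                    action.accept(item);
                }
            });
        }

        void stop() {
            stopped = true;
        }
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);
        AtomicInteger sum = new AtomicInteger();
        MiniContext ctx = new MiniContext();

        // Blocking call: the work is finished when foreach returns.
        ctx.foreach(data, sum::addAndGet);

        // Async call: keep the future and wait for it BEFORE stopping the context.
        CompletableFuture<Void> job = ctx.foreachAsync(data, sum::addAndGet);
        job.join();
        ctx.stop();
        System.out.println(sum.get()); // prints 20

        // A job that only starts after stop() fails, as in the question's log.
        try {
            ctx.foreachAsync(data, sum::addAndGet).join();
        } catch (CompletionException e) {
            System.out.println(e.getCause().getMessage()); // prints "context already stopped"
        }
    }
}
```

With real Spark code, the equivalent step would be to keep the `JavaFutureAction` returned by `rdds.foreachAsync(...)` and call its `get()` before the `JavaSparkContext` is stopped or `main` returns.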
    

0 answers:

No answers yet