[Java] org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0

Time: 2016-03-17 05:24:31

Tags: java apache-spark cassandra

I have about 24 GB of records that I read from Cassandra and process with Spark. I use flatMapToPair and filter transformations, and then store the resulting RDD using the DataStax Cassandra connector. But when the save-to-Cassandra action runs, my executors fail and start throwing the following exception -

16/03/17 03:00:32 WARN TaskSetManager: Lost task 11.1 in stage 3.0 (TID 133, 10.0.0.65): FetchFailed(null, shuffleId=0, mapId=-1, reduceId=11, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0
        at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$2.apply(MapOutputTracker.scala:460)
        at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$2.apply(MapOutputTracker.scala:456)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
        at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:456)
        at org.apache.spark.MapOutputTracker.getMapSizesByExecutorId(MapOutputTracker.scala:183)
        at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:47)
        at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
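
For context, here is a minimal sketch of the read-transform-save pipeline described above, using the DataStax spark-cassandra-connector Java API (assuming connector 1.2+; the keyspace, table, and bean class names are placeholders, not my actual job):

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapRowTo;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Read the source table into Products beans (keyspace/table names are placeholders)
JavaRDD<Products> source = javaFunctions(sc)
        .cassandraTable("my_keyspace", "products", mapRowTo(Products.class));

// ... flatMapToPair and filter transformations run here ...

// Save the processed RDD back to Cassandra through the connector
javaFunctions(processedRdd)
        .writerBuilder("my_keyspace", "products_by_category", mapToRow(CategoryProducts.class))
        .saveToCassandra();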

While checking the executor logs, they show the following error -

ERROR MapOutputTracker: Missing an output location for shuffle 0

Once it throws this exception, Spark restarts all the stages again. My cluster has 2 nodes with 16 GB of memory and 4 cores. I am running Spark as a standalone cluster, with 12 GB allocated to each worker node. Moreover, when I run the job on my local machine with 10 GB of data, it works perfectly fine. I have tried changing the persistence level to DISK_ONLY and MEMORY_AND_DISK, but to no avail.
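
For reference, switching the persistence level amounts to something like this (the RDD variable name is a placeholder; StorageLevel comes from the standard Spark Java API):

import org.apache.spark.storage.StorageLevel;

// Keep the shuffled RDD on disk only; MEMORY_AND_DISK was tried the same way
shuffledRdd.persist(StorageLevel.DISK_ONLY());
// shuffledRdd.persist(StorageLevel.MEMORY_AND_DISK());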

Edit 1 -

I have been able to figure out that the shuffle is failing during the reduceByKey operation, because the aggregated records for a single key add up to 250 MB. Here is my reduceByKey snippet -

JavaPairRDD<String, Products[]> categoryMapCollection = categoryMapFiltered.reduceByKey(
        new Function2<Products[], Products[], Products[]>() {
            @Override
            public Products[] call(Products[] p1, Products[] p2) {
                // Concatenate the two Products arrays for this key (Apache Commons Lang ArrayUtils)
                Products[] both = (Products[]) ArrayUtils.addAll(p1, p2);
                return both;
            }
        });
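
As a note on the snippet above: every merge calls ArrayUtils.addAll, which allocates and copies a new array, so a skewed key whose values add up to 250 MB gets copied over and over during the shuffle. Below is a hedged sketch of the same aggregation done with aggregateByKey into a growing list instead (assuming categoryMapFiltered is a JavaPairRDD<String, Products[]> as implied above; this is an alternative formulation, not the code I am running):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.Function2;

// Collect the values of each key into one list, appending in place instead of re-copying arrays
JavaPairRDD<String, List<Products>> categoryListCollection =
        categoryMapFiltered.aggregateByKey(
                new ArrayList<Products>(),
                new Function2<List<Products>, Products[], List<Products>>() {
                    @Override
                    public List<Products> call(List<Products> acc, Products[] next) {
                        acc.addAll(Arrays.asList(next));
                        return acc;
                    }
                },
                new Function2<List<Products>, List<Products>, List<Products>>() {
                    @Override
                    public List<Products> call(List<Products> a, List<Products> b) {
                        a.addAll(b);
                        return a;
                    }
                });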

0 Answers:

There are no answers