java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRow cannot be cast to scala.collection.Iterator

Asked: 2016-04-22 03:07:41

Tags: apache-spark dataframe bigdata

Can anyone help? I ran into this problem when creating a DataFrame from an RDD.


    [ERROR] [2016-04-22 18:21:46] [HBaseOperator:load:140] failed:
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, host26): java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRow cannot be cast to scala.collection.Iterator
        at org.apache.spark.sql.SQLContext$$anonfun$9.apply(SQLContext.scala:519)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at org.apache.spark.sql.execution.Aggregate$$anonfun$doExecute$1$$anonfun$6.apply(Aggregate.scala:130)
        at org.apache.spark.sql.execution.Aggregate$$anonfun$doExecute$1$$anonfun$6.apply(Aggregate.scala:126)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:70)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

It works when I submit it with yarn-client, but when I submit it from my own code I get this error. My code:

        StructType type = new StructType(fields);
        scala.collection.immutable.List<StructField> list = type.toList();
        for (StructField structField : fields)
        {
            LOGGER.info(structField.name() + " " + structField.dataType().json());
        }
        dataFrame = sqlContext.createDataFrame(rowRdd, type);
        LOGGER.debug("dataframe num " + dataFrame.count());

It fails to create the DataFrame.
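For comparison, here is a minimal self-contained sketch of the standard pattern for building a DataFrame from a `JavaRDD<Row>` with an explicit schema, using the Spark 1.x Java API that the question targets. The class name, column names, and local master are illustrative, not taken from the question; since the job succeeds under yarn-client but fails when submitted programmatically, it is also worth checking that the Spark version on the application's classpath matches the cluster's, as a mismatch is a common cause of this kind of ClassCastException.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class CreateDataFrameSketch {

    // Builds a two-row DataFrame from a JavaRDD<Row> and returns its count.
    static long buildAndCount() {
        // local[1] master for illustration only; the original job ran on YARN.
        SparkConf conf = new SparkConf().setAppName("sketch").setMaster("local[1]");
        JavaSparkContext jsc = new JavaSparkContext(conf);
        try {
            SQLContext sqlContext = new SQLContext(jsc);

            // Build the schema with the DataTypes factory methods.
            StructType schema = DataTypes.createStructType(Arrays.asList(
                    DataTypes.createStructField("rowkey", DataTypes.StringType, false),
                    DataTypes.createStructField("value", DataTypes.IntegerType, true)));

            // Each Row must be built with RowFactory and must match the schema
            // column-for-column; a mismatch surfaces only when the job runs.
            List<Row> rows = Arrays.asList(
                    RowFactory.create("a", 1),
                    RowFactory.create("b", 2));
            JavaRDD<Row> rowRdd = jsc.parallelize(rows);

            DataFrame dataFrame = sqlContext.createDataFrame(rowRdd, schema);
            return dataFrame.count();
        } finally {
            jsc.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println("dataframe num " + buildAndCount());
    }
}
```

The count forces the job to actually execute, which is where a schema/row mismatch or classpath conflict would show up as an exception like the one above.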

Can anyone help me?

0 Answers:

There are no answers.