show fails on DataFrame with exception -> Caused by: java.io.NotSerializableException: java.lang.Object on the object returned from mapPartitions

Date: 2018-06-23 09:18:39

Tags: scala apache-spark dataframe apache-spark-sql apache-spark-2.0

I am getting the exception below when printing the DataFrame data returned from mapPartitions:

Caused by: java.io.NotSerializableException: java.lang.Object
Serialization stack:
    - object not serializable (class: java.lang.Object, value: java.lang.Object@382eb59a)
    - field (class: sql.MapPartitions$$anonfun$1, name: nonLocalReturnKey1$1, type: class java.lang.Object)
    - object (class sql.MapPartitions$$anonfun$1, <function1>)
    - field (class: org.apache.spark.sql.execution.MapPartitionsExec, name: func, type: interface scala.Function1)
    - object (class org.apache.spark.sql.execution.MapPartitionsExec, MapPartitions <function1>, obj#22: org.apache.spark.sql.Row
+- DeserializeToObject createexternalrow(_c0#10.toString, _c1#11.toString, _c2#12.toString, _c3#13.toString, StructField(_c0,StringType,true), StructField(_c1,StringType,true), StructField(_c2,StringType,true), StructField(_c3,StringType,true)), obj#21: org.apache.spark.sql.Row
   +- *(1) FileScan csv [_c0#10,_c1#11,_c2#12,_c3#13] Batched: false, Format: CSV, Location: InMemoryFileIndex[file:/C:/Users/Vikas Singh/Documents/Vikas/Study/Spark Material/spark/sample_fi..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<_c0:string,_c1:string,_c2:string,_c3:string>
)

Here is the code snippet:

import org.apache.spark.sql.types.{StructType, StructField, StringType}
import org.apache.spark.sql.catalyst.encoders.RowEncoder

val dataDF = spark.read.format("csv")
  .option("header", "false")
  .load("dataFile.csv")

val schema = StructType(Seq(
  StructField("year", StringType),
  StructField("make", StringType),
  StructField("model", StringType)
))
val encoder = RowEncoder(schema)
val transformed = dataDF.mapPartitions(partition => {
  val recMapPartitions = partition.map(rec => {
    rec.getAs(1)
  })

  return recMapPartitions
})(encoder)

transformed.show()
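For context, the `nonLocalReturnKey1$1` field in the serialization stack is the likely culprit: a `return` inside a Scala lambda is a *non-local return*, which scalac implements by allocating a plain `new Object` as a key and capturing it inside the closure. Since `java.lang.Object` is not `Serializable`, Spark cannot ship the closure to executors. The sketch below (plain Scala, no Spark; object and method names are mine, chosen for illustration) reproduces the same failure with Java serialization:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ReturnCaptureDemo {
  // `return` inside a lambda returns from the *enclosing method*. scalac
  // implements this by capturing a fresh `new Object` key in the closure
  // (the `nonLocalReturnKey1$1` field seen in the stack trace above).
  def closureWithReturn(): Int => Int = {
    val f: Int => Int = x => {
      if (x < 0) return { (y: Int) => y } // non-local return: captures the key
      x * 2
    }
    f
  }

  // Same logic with the result as the last expression: nothing
  // non-serializable is captured.
  def closureWithoutReturn(): Int => Int = { x =>
    if (x < 0) x else x * 2
  }

  // Try to Java-serialize an object, as Spark's closure cleaner would.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    println(canSerialize(closureWithReturn()))    // false: captured Object key
    println(canSerialize(closureWithoutReturn())) // true
  }
}
```

Applied to the snippet above, dropping the `return` keyword so that `recMapPartitions` is simply the last expression of the lambda should remove the non-serializable capture. Note also that `RowEncoder(schema)` expects the iterator to yield `Row` values matching the three-field schema, so `rec.getAs(1)` alone may raise a separate mismatch once serialization succeeds.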

Can you please help?

0 Answers:

No answers yet