Parquet file exception: The UTC Offset of the local dateTime parameter does not match the offset argument

Asked: 2020-06-24 10:46:31

Tags: dataframe apache-spark apache-spark-sql

import org.apache.spark.sql.SaveMode
import sparkSession.sqlContext.implicits._
val df = Seq(("2014-10-06"), ("2014-10-07"), ("2014-10-08"), ("2014-10-09"), ("2014-10-10")).toDF("DATE")
df.printSchema()

import org.apache.spark.sql.functions.{col, to_date}
val df2 = df.withColumn("DATE", to_date(col("DATE"), "yyyy-MM-dd"))
df2.printSchema()

df2.write.mode(SaveMode.Overwrite).parquet("C:\\TEMP\\")

root
 |-- DATE: string (nullable = true)

root
 |-- DATE: date (nullable = true)

In the code above I am able to convert the DATE column from string to date type, but when I open the output Parquet file I get the following error:

Parquet.ParquetException: fatal error reading column "DATE". System.ArgumentException: The UTC Offset of the local dateTime parameter does not match the offset argument.
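For what it's worth, that exception text matches the message thrown by .NET's `DateTimeOffset` constructor, which suggests the file is being opened with a .NET-based Parquet reader rather than with Spark. A rough `java.time` analogue of the mismatch it complains about (the zone and offsets below are purely illustrative, not taken from the question):

```scala
import java.time.{LocalDateTime, ZoneId, ZoneOffset}

// A local wall-clock time paired with an explicit UTC offset is only
// consistent if that offset is what the zone's rules actually produce
// for that moment; otherwise a .NET reader rejects the combination.
val zone  = ZoneId.of("Europe/Berlin")
val local = LocalDateTime.of(2014, 10, 6, 0, 0)

val actualOffset  = zone.getRules.getOffset(local) // +02:00 (CEST on that date)
val claimedOffset = ZoneOffset.ofHours(1)          // +01:00 — the mismatch case

println(actualOffset == claimedOffset) // false: this is the kind of
                                       // inconsistency being reported
```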

Can anyone help me?

1 Answer:

Answer 0 (score: 1)

I was not able to reproduce this.

Try writing the same data and reading it back:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.{col, to_date}

val df1 = Seq(("2014-10-06"), ("2014-10-07"), ("2014-10-08"), ("2014-10-09"), ("2014-10-10")).toDF("DATE")
df1.printSchema()

/**
  * root
  * |-- DATE: string (nullable = true)
  */

val df2 = df1.withColumn("DATE", to_date(col("DATE"), "yyyy-MM-dd"))
df2.printSchema()

/**
  * root
  * |-- DATE: date (nullable = true)
  */

df2.write.mode(SaveMode.Overwrite).parquet("/Users/sokale/models/stack")

spark.read.parquet("/Users/sokale/models/stack").show(false)

/**
  * +----------+
  * |DATE      |
  * +----------+
  * |2014-10-08|
  * |2014-10-09|
  * |2014-10-10|
  * |2014-10-06|
  * |2014-10-07|
  * +----------+
  */