import sparkSession.sqlContext.implicits._
val df = Seq(("2014-10-06"), ("2014-10-07"), ("2014-10-08"), ("2014-10-09"), ("2014-10-10")).toDF("DATE")
df.printSchema()
import org.apache.spark.sql.SaveMode // needed for SaveMode.Overwrite below
import org.apache.spark.sql.functions.{col, to_date}
val df2 = df.withColumn("DATE", to_date(col("DATE"), "yyyy-MM-dd"))
df2.printSchema()
df2.write.mode(SaveMode.Overwrite).parquet("C:\\TEMP\\")
root
|-- DATE: string (nullable = true)
root
|-- DATE: date (nullable = true)
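With this code I can convert the DATE column from string to date type, but when I open the output Parquet file I get the following error:
Parquet.ParquetException: fatal error reading column 'DATE' System.ArgumentException: The UTC Offset of the local dateTime parameter does not match the offset argument.
Can someone help me?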
Answer 0: (score: 1)
I could not reproduce this.
Try writing the same data and reading it back with Spark:
val df1 = Seq(("2014-10-06"), ("2014-10-07"), ("2014-10-08"), ("2014-10-09"), ("2014-10-10")).toDF("DATE")
df1.printSchema()
/**
* root
* |-- DATE: string (nullable = true)
*/
import org.apache.spark.sql.SaveMode // needed for SaveMode.Overwrite below
import org.apache.spark.sql.functions.{col, to_date}
val df2 = df1.withColumn("DATE", to_date(col("DATE"), "yyyy-MM-dd"))
df2.printSchema()
/**
* root
* |-- DATE: date (nullable = true)
*/
df2.write.mode(SaveMode.Overwrite).parquet("/Users/sokale/models/stack")
spark.read.parquet("/Users/sokale/models/stack").show(false)
/**
* +----------+
* |DATE      |
* +----------+
* |2014-10-08|
* |2014-10-09|
* |2014-10-10|
* |2014-10-06|
* |2014-10-07|
* +----------+
*/
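Beyond eyeballing the `show` output, the round trip can be checked programmatically. The following is only a minimal sketch: the local SparkSession, the temporary output directory, and the names `outDir`, `readBack`, `dateIsDateType` and `values` are assumptions made for illustration and are not part of the original question.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions.{col, to_date}
import org.apache.spark.sql.types.DateType

// Assumption: a local session; in spark-shell a `spark` value already exists.
val spark = SparkSession.builder().master("local[*]").appName("date-roundtrip").getOrCreate()
import spark.implicits._

// Hypothetical scratch location, used only for this check.
val outDir = Files.createTempDirectory("date-roundtrip").resolve("out").toString

// Same conversion as in the question, on a smaller sample.
val written = Seq("2014-10-06", "2014-10-07").toDF("DATE")
  .withColumn("DATE", to_date(col("DATE"), "yyyy-MM-dd"))
written.write.mode(SaveMode.Overwrite).parquet(outDir)

val readBack = spark.read.parquet(outDir)

// The column should come back as DateType, not as string or timestamp.
val dateIsDateType = readBack.schema("DATE").dataType == DateType

// Compare values as sorted strings so the check does not depend on row order.
val values = readBack.select(col("DATE").cast("string")).as[String].collect().sorted

println(s"DATE is DateType: $dateIsDateType, values: ${values.mkString(", ")}")
```

If this check passes, the file Spark wrote is probably fine. The exception names in the question (`Parquet.ParquetException`, `System.ArgumentException`) look like .NET types, so the error may come from the tool used to open the file rather than from Spark, though that is a guess based only on the error text.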