Spark DataFrame: converting a column's data type from String to Date

Time: 2016-05-23 09:06:10

Tags: scala apache-spark spark-dataframe

I have the following data, shown here with its schema:

scala> df2.printSchema()
root
 |-- RowID: integer (nullable = true)
 |-- Order Date: string (nullable = true)

scala> df2.show(5)
+-----+----------+
|RowID|Order Date|
+-----+----------+
|    1|   4/10/15|
|   49|   4/10/15|
|   50|   4/10/15|
|   80|   4/10/15|
|   85|   4/10/15|
+-----+----------+

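I want to convert the "Order Date" string column to the Date data type. I tried the following with no luck: every row comes back null. Can anyone suggest a better way to do this?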

scala> df2.select(df2.col("RowID"), df2.col("Order Date"), date_format(df2.col("Order Date"), "M/dd/yy")).show(5)
+-----+----------+-------------------------------+
|RowID|Order Date|date_format(Order Date,M/dd/yy)|
+-----+----------+-------------------------------+
|    1|   4/10/15|                           null|
|   49|   4/10/15|                           null|
|   50|   4/10/15|                           null|
|   80|   4/10/15|                           null|
|   85|   4/10/15|                           null|
+-----+----------+-------------------------------+

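1 answer:

Answer 0: (score: 1)

I managed to convert the column to a Unix epoch timestamp (in seconds) with unix_timestamp. I think it should be straightforward from here: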

scala> df.select(df.col("RowID"), df.col("Order Date"), unix_timestamp(df.col("Order Date"), "M/d/yy")).show(5)
+-----+----------+--------------------------------+
|RowID|Order Date|unixtimestamp(Order Date,M/d/yy)|
+-----+----------+--------------------------------+
|    1|   4/10/15|                      1428604200|
|   49|   4/10/15|                      1428604200|
|   50|   4/10/15|                      1428604200|
|   80|   4/10/15|                      1428604200|
|   85|   4/10/15|                      1428604200|
+-----+----------+--------------------------------+
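
To finish the conversion from here, one option is to cast the epoch seconds back to a timestamp and then to a date. As far as I can tell, the attempt in the question returned null because date_format turns a date or timestamp into a string rather than parsing one, so the "4/10/15" strings could not be read as dates. The following is a minimal sketch, not a definitive solution. It assumes Spark 2.0 or later (for SparkSession) and uses a small made-up sample frame in place of the real df; the names spark and withDate are only illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, unix_timestamp}

// Assumption: Spark 2.0+; inside spark-shell, getOrCreate() reuses the existing session.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("StringToDate")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample rows with the same layout as the data in the question.
val df = Seq((1, "4/10/15"), (49, "4/10/15")).toDF("RowID", "Order Date")

// Parse the string to epoch seconds, then cast seconds -> timestamp -> date.
// Parsing and casting both use the session time zone, so the calendar day is preserved.
val withDate = df.withColumn(
  "Order Date",
  unix_timestamp(col("Order Date"), "M/d/yy").cast("timestamp").cast("date")
)

withDate.printSchema() // "Order Date" should now be reported as date
withDate.show(5)
```

On Spark 2.2 or later, to_date(col("Order Date"), "M/d/yy") should give the same result in a single call. On Spark 1.6 the same cast chain can be applied directly to the df already loaded in the shell.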