Converting a DataFrame string column to a timestamp

Time: 2019-11-17 02:15:29

Tags: scala apache-spark

I am trying to convert a string date column to a timestamp column with the following code:

val df = Seq(
    ("19-APR-2019 10:11:10"),
    ("19-MAR-2019 10:11:10"),
    ("19-FEB-2019 10:11:10")
  ).toDF("date")
  .withColumn("new_date", to_utc_timestamp(to_date('date, "dd-MMM-yyyy hh:mm:ss"), "UTC"))

  df.show

It almost works, but the time of day is lost:

+--------------------+-------------------+
|                date|           new_date|
+--------------------+-------------------+
|19-APR-2019 10:11:10|2019-04-19 00:00:00|
|19-MAR-2019 10:11:10|2019-03-19 00:00:00|
|19-FEB-2019 10:11:10|2019-02-19 00:00:00|
+--------------------+-------------------+

Do you have any ideas or alternative solutions?

1 answer:

Answer 0 (score: 0)

As SMaz mentioned in the comments, the following lines do the trick:

import org.apache.spark.sql.functions.to_timestamp

// HH is the 24-hour field; hh is the 12-hour field and expects an AM/PM marker
df.withColumn("new_date", to_timestamp('date, "dd-MMM-yyyy HH:mm:ss"))
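
For context, the time of day disappears in the question because to_date returns a date, so the later conversion to a timestamp can only yield midnight. The following is a minimal, self-contained sketch of the fix, assuming Spark 2.2 or later (where to_timestamp with a format argument is available) and a local SparkSession. The object name StringToTimestamp and the helper withTimestamp are illustrative and not part of the original post. How upper-case month abbreviations such as "APR" are parsed may depend on the Spark version and locale, so treat the pattern as something to verify on your own cluster.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, to_timestamp}

object StringToTimestamp {

  // Hypothetical helper: adds a "new_date" timestamp column parsed from the
  // string column "date". to_timestamp keeps the time of day, unlike to_date.
  def withTimestamp(df: DataFrame): DataFrame =
    df.withColumn("new_date", to_timestamp(col("date"), "dd-MMM-yyyy HH:mm:ss"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for experimentation; in spark-shell a
    // session named `spark` already exists and this step is unnecessary.
    val spark = SparkSession.builder()
      .appName("string-to-timestamp")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      "19-APR-2019 10:11:10",
      "19-MAR-2019 10:11:10",
      "19-FEB-2019 10:11:10"
    ).toDF("date")

    val result = withTimestamp(df)
    result.printSchema() // new_date should be reported as a timestamp column
    result.show(false)

    spark.stop()
  }
}
```

If a value fails to match the pattern, to_timestamp typically yields null rather than raising an error (depending on version and configuration), so counting nulls in new_date is a quick sanity check.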