Spark 2.0: How do I convert a DataFrame date/timestamp column to another date format in Scala?

Time: 2019-06-04 06:10:42

Tags: scala apache-spark dataframe apache-spark-sql

For learning purposes, I have been working with the following sample dataset.

+-------------------+-----+-----+-----+-----+-------+
|             MyDate| Open| High|  Low|Close| Volume|
+-------------------+-----+-----+-----+-----+-------+
|2006-01-03 00:00:00|983.8|493.8|481.1|492.9|1537660|
|2006-01-04 00:00:00|979.6|491.0|483.5|483.8|1871020|
|2006-01-05 00:00:00|972.2|487.8|484.0|486.2|1143160|
|2006-01-06 00:00:00|977.8|489.0|482.0|486.2|1370250|
|2006-01-09 00:00:00|973.4|487.4|483.0|483.9|1680740|
+-------------------+-----+-----+-----+-----+-------+

I am trying to change the "MyDate" column values to a different format such as "YYYY-MON", and wrote the following:

citiDataDF.withColumn("New-Mydate",to_timestamp($"MyDate", "yyyy-MON")).show(5)

After running the code, the new column "New-Mydate" appears, but its values are not in the format I want. Can you help?

1 answer:

Answer 0 (score: 0)

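You need date_format rather than to_timestamp. to_timestamp parses a string into a timestamp, whereas date_format renders a date or timestamp as a string in the pattern you give it.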

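import org.apache.spark.sql.functions.date_format
val citiDataDF = List("2006-01-03 00:00:00").toDF("MyDate")
// Format the existing MyDate column; the new column holds strings
citiDataDF.withColumn("New-Mydate", date_format($"MyDate", "yyyy-MMM")).show(5)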

Result:

+-------------------+----------+
|             MyDate|New-Mydate|
+-------------------+----------+
|2006-01-03 00:00:00|  2006-Jan|
+-------------------+----------+

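Note: three "M"s ("MMM") render the month as an abbreviated name such as "Jan". If you want the month as a number, use two "M"s ("MM"), which gives a zero-padded value such as "01".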
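
To see the two month patterns side by side, here is a minimal sketch that can be run outside spark-shell. The object name DateFormatSketch, the helper withMonthColumns, and the column names MonthName and MonthNumber are made up for illustration; only date_format and the sample value come from the answer above. It assumes a local Spark installation with English month names.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, date_format}

object DateFormatSketch {
  // Hypothetical helper: adds one column per month pattern.
  // date_format returns strings, so both new columns are string columns.
  def withMonthColumns(df: DataFrame): DataFrame =
    df.withColumn("MonthName", date_format(col("MyDate"), "yyyy-MMM"))  // abbreviated month name
      .withColumn("MonthNumber", date_format(col("MyDate"), "yyyy-MM")) // zero-padded month number

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .appName("date-format-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = List("2006-01-03 00:00:00").toDF("MyDate")
    withMonthColumns(df).show(false)

    spark.stop()
  }
}
```

For the sample row, MonthName should match the "2006-Jan" shown in the result above, and MonthNumber should read "2006-01". Because both columns are strings, keep the original MyDate column if you still need to sort or filter by date.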