Scala Spark: Convert double columns to datetime columns in a DataFrame

Time: 2016-08-29 14:36:53

Tags: scala date apache-spark

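I am trying to write code that converts the columns date and last_updated_date, which actually hold Unix time stored as doubles, into "mm-dd-yyyy" format for display. How can I do this?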

import org.joda.time._
import scala.tools._
import org.joda.time.format.DateTimeFormat._
import java.text.SimpleDateFormat
import org.apache.spark.sql.functions.{unix_timestamp, to_date}

root
 |-- date: double (nullable = false)
 |-- last_updated_date: double (nullable = false)
 |-- Percent_Used: double (nullable = false)

+------------+---------------------+------------+
|        date|    last_updated_date|Percent_Used|
+------------+---------------------+------------+
| 1.453923E12|        1.47080394E12| 1.948327124|
|1.4539233E12|        1.47080394E12| 2.019636442|
|1.4539236E12|        1.47080394E12| 1.995299371|
+------------+---------------------+------------+
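
The values in the table are on the order of 1.45E12, which suggests milliseconds since the Unix epoch rather than seconds (a seconds value for 2016 would be near 1.45E9). Below is a minimal plain-Scala sketch of the intended conversion, assuming that millisecond reading and UTC for display. The names EpochMillisDemo and formatEpochMillis are hypothetical and used only for illustration. Note that the pattern uses MM for the month, since lowercase mm means minutes.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

object EpochMillisDemo {
  // MM = month, dd = day of month, yyyy = year; the zone is needed to format an Instant.
  private val fmt: DateTimeFormatter =
    DateTimeFormatter.ofPattern("MM-dd-yyyy").withZone(ZoneOffset.UTC)

  // Assumption: the double holds milliseconds since 1970-01-01T00:00:00Z.
  def formatEpochMillis(ms: Double): String =
    fmt.format(Instant.ofEpochMilli(ms.toLong))

  def main(args: Array[String]): Unit = {
    println(formatEpochMillis(1.453923e12))   // 01-27-2016
    println(formatEpochMillis(1.47080394e12)) // 08-10-2016
  }
}
```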

3 answers:

Answer 0 (score: 2)

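Cast to timestamp:

// Casting a double to timestamp treats it as seconds; the values in the question look like milliseconds, hence the division.
df.select((col("date") / 1000).cast("timestamp"))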

Answer 1 (score: 1)

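Use from_unixtime to convert it to a formatted timestamp string:

// from_unixtime takes a Column of epoch seconds (not a column name); the values in the question look like milliseconds.
df.select(from_unixtime((col("date") / 1000).cast("long"), "MM-dd-yyyy").as("date"))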
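
The two answers above can be combined into one self-contained sketch that rewrites both columns for display. It assumes Spark 2.x or later (for SparkSession), that the doubles are epoch milliseconds, and that a local session is acceptable for a demo. DoubleToDateDemo and formatEpochMillis are hypothetical names, not part of Spark.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, date_format}

object DoubleToDateDemo {
  // Replace each named double column (assumed epoch milliseconds)
  // with an "MM-dd-yyyy" string column of the same name.
  def formatEpochMillis(df: DataFrame, names: Seq[String]): DataFrame =
    names.foldLeft(df) { (acc, name) =>
      // Casting a double to timestamp interprets it as seconds, so divide first.
      val ts = (col(name) / 1000).cast("timestamp")
      acc.withColumn(name, date_format(ts, "MM-dd-yyyy"))
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("double-to-date")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question.
    val df = Seq(
      (1.453923e12, 1.47080394e12, 1.948327124),
      (1.4539233e12, 1.47080394e12, 2.019636442),
      (1.4539236e12, 1.47080394e12, 1.995299371)
    ).toDF("date", "last_updated_date", "Percent_Used")

    val formatted = formatEpochMillis(df, Seq("date", "last_updated_date"))
    formatted.printSchema()
    formatted.show()

    spark.stop()
  }
}
```

The resulting columns are strings, which suits display but not date arithmetic; keeping the intermediate timestamp column is the better choice if further computation is needed. The rendered day depends on the session time zone.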

Answer 2 (score: 1)

Fetching datetime from float in Python

This answer worked for me. Give it a try; it is really just a calculation in seconds:

import datetime
serial = 43822.59722222222
seconds = (serial - 25569) * 86400.0
print(datetime.datetime.utcfromtimestamp(seconds))

Convert excel timestamp double value into datetime or timestamp
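
The Python arithmetic in this answer applies to Excel-style serial dates, where the double counts days and 25569 is the serial number of 1970-01-01. That is a different encoding from the ~1.45E12 values in the question, which look like epoch milliseconds. For comparison, here is a rough Scala equivalent, assuming the input really is an Excel serial in UTC. ExcelSerialDemo and excelSerialToInstant are hypothetical names.

```scala
import java.time.Instant

object ExcelSerialDemo {
  // Days between the Excel epoch (1899-12-30) and the Unix epoch (1970-01-01).
  private val UnixEpochSerial = 25569.0
  private val MillisPerDay = 86400000.0

  // Assumption: serial is a day count in the Excel 1900 date system, in UTC.
  // Rounding to whole milliseconds absorbs floating-point noise.
  def excelSerialToInstant(serial: Double): Instant =
    Instant.ofEpochMilli(math.round((serial - UnixEpochSerial) * MillisPerDay))

  def main(args: Array[String]): Unit = {
    println(excelSerialToInstant(43822.59722222222)) // 2019-12-23T14:20:00Z
  }
}
```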