I have a column of timestamps stored as strings. I want to convert them to dates in "yyyy-MM-dd" format.
+-------------------+
|           date_col|
+-------------------+
|2019-01-01 08:01:45|
|2019-01-02 17:17:25|
|2019-01-03 15:01:45|
+-------------------+
I would like to get '2019-01-01', '2019-01-02', '2019-01-03' as output.
Answer 0 (score: 0)
Use substring and to_date:
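from pyspark.sql import Row, SparkSession
from pyspark.sql.functions import to_date, substring, col

# Reuse the active session (or create one) instead of relying on a predefined `sc`.
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([Row(date_col="2019-01-01 08:01:45"), Row(date_col="2019-01-02 17:17:25"), Row(date_col="2019-01-03 15:01:45")])
# substring positions are 1-based: keep the first 10 characters, then parse them as a date.
df = df.withColumn("new_date", to_date(substring(col("date_col"), 1, 10), "yyyy-MM-dd"))
df.show()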
+-------------------+----------+
|           date_col|  new_date|
+-------------------+----------+
|2019-01-01 08:01:45|2019-01-01|
|2019-01-02 17:17:25|2019-01-02|
|2019-01-03 15:01:45|2019-01-03|
+-------------------+----------+
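To sanity-check the expected values without a Spark session, the two steps above (take the first 10 characters, then parse them as a date) can be mirrored in plain Python. This is only a sketch of the logic, not Spark code; the helper name `to_date_string` is hypothetical and is not part of the original answer.

```python
from datetime import datetime


def to_date_string(timestamp: str) -> str:
    """Return the 'yyyy-MM-dd' part of a 'yyyy-MM-dd HH:mm:ss' string."""
    # Step 1: keep the first 10 characters, like substring(col, 1, 10).
    date_part = timestamp[:10]
    # Step 2: parse them as a date, like to_date(..., "yyyy-MM-dd").
    # strptime raises ValueError if the text is not a valid date.
    parsed = datetime.strptime(date_part, "%Y-%m-%d").date()
    return parsed.isoformat()


timestamps = [
    "2019-01-01 08:01:45",
    "2019-01-02 17:17:25",
    "2019-01-03 15:01:45",
]
dates = [to_date_string(ts) for ts in timestamps]
print(dates)
```

In Spark itself, passing the whole column to to_date with the pattern "yyyy-MM-dd HH:mm:ss" should give the same dates without the substring step, although that variant was not part of the original answer.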