I want to convert a String column to a Date column, but the resulting column contains only null values.
from pyspark.sql.functions import expr, from_unixtime, dayofmonth, unix_timestamp, year, to_date, col
dane3.withColumn('date', to_date(unix_timestamp(col('dateRep'),'%d.%m.%Y').cast("timestamp"))).show()
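Answer 0 (score: 1)
You don't need to put % before d/m/y as you would in SQL. Spark expects Java-style datetime patterns such as 'dd.MM.yyyy'; a pattern that does not match the input typically yields null, which is why the column comes out empty. More info in the docs.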
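from pyspark.sql.functions import col, to_date, unix_timestamp

# Sample data: dates stored as strings in day.month.year form.
# Assumes an existing SparkSession named spark.
df = spark.createDataFrame(
    [
        (1, '27.08.2020'),
        (2, '27.08.2019'),
    ],
    ['id', 'txt']
)
# Parse with the Java-style pattern 'dd.MM.yyyy' (no % signs), then truncate to a date.
df = df.withColumn('formatted',
                   to_date(unix_timestamp(col('txt'), 'dd.MM.yyyy').cast("timestamp")))
df.show()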
+---+----------+----------+
| id|       txt| formatted|
+---+----------+----------+
|  1|27.08.2020|2020-08-27|
|  2|27.08.2019|2019-08-27|
+---+----------+----------+
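To make the difference between the two format styles concrete, here is a minimal sketch in plain Python that rewrites a strptime-style format such as '%d.%m.%Y' into the Java-style pattern Spark expects. The helper name strptime_to_spark_pattern and its mapping table are hypothetical, not part of PySpark, and only a handful of directives are covered.

```python
import re

# Hypothetical mapping from a few common strptime directives to
# Java-style pattern letters; this is an illustration, not a PySpark API.
_DIRECTIVES = {
    "%d": "dd",    # day of month, zero-padded
    "%m": "MM",    # month number, zero-padded
    "%Y": "yyyy",  # calendar year (lowercase; uppercase Y means week-based year in Java patterns)
    "%H": "HH",    # hour, 24-hour clock
    "%M": "mm",    # minute
    "%S": "ss",    # second
}


def strptime_to_spark_pattern(fmt):
    """Rewrite a strptime-style format as a Java-style datetime pattern.

    Raises ValueError for directives that are not in the table above.
    Literal letters in fmt are not quoted, so this only suits formats
    whose separators are punctuation or spaces.
    """
    def replace(match):
        directive = match.group(0)
        if directive not in _DIRECTIVES:
            raise ValueError("unsupported directive: " + directive)
        return _DIRECTIVES[directive]

    # Replace every '%' plus the character that follows it.
    return re.sub(r"%.", replace, fmt)


print(strptime_to_spark_pattern("%d.%m.%Y"))  # dd.MM.yyyy
```

The resulting string can be used wherever the answer's code passes 'dd.MM.yyyy'. In Spark 2.2 and later, to_date also accepts a format argument directly, so to_date(col('txt'), 'dd.MM.yyyy') should give the same result without the unix_timestamp and cast steps.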