I am trying to extract a value from a table using PySpark, and I need the value in the following format: 2020-06-17T15:08:24z
existingMaxModifiedDate = spark.sql('select max(lastModDt) as lastModDate from db.tbl')
jobMetadata = existingMaxModifiedDate.withColumn("maxDate", date_format(to_timestamp(existingMaxModifiedDate.lastModDate, "yyyy-mm-dd HH:MM:SS.SSS"), "yyyy-mm-dd HH:MM:SS.SSS"))
However, I keep getting null for the created column "maxDate". Thanks.
Answer 0 (score: 1)
Perhaps this is useful:
val timeDF = spark.sql(
"""
|select current_timestamp() as time1,
| translate(date_format(current_timestamp(), 'yyyy-MM-dd HH:mm:ssZ') ,' ', 'T') as time2,
| translate(date_format(current_timestamp(), 'yyyy-MM-dd#HH:mm:ss$') ,'#$', 'Tz') as time3
""".stripMargin)
timeDF.show(false)
timeDF.printSchema()
/**
* +-----------------------+------------------------+--------------------+
* |time1 |time2 |time3 |
* +-----------------------+------------------------+--------------------+
* |2020-06-30 21:22:04.541|2020-06-30T21:22:04+0530|2020-06-30T21:22:04z|
* +-----------------------+------------------------+--------------------+
*
* root
* |-- time1: timestamp (nullable = false)
* |-- time2: string (nullable = false)
* |-- time3: string (nullable = false)
*/
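For the PySpark code in the question, the null most likely comes from the pattern letters, which are case-sensitive: `MM` is month, `mm` is minute-of-hour, `ss` is second and `SSS` is fraction-of-second, so "yyyy-mm-dd HH:MM:SS.SSS" does not describe the data. The following is a minimal sketch of one way to get the requested shape, not a definitive fix. It assumes `lastModDate` is either a timestamp or a string in Spark's default timestamp layout, so `to_timestamp` is called without a pattern. The helper name `with_max_date` and its parameters are hypothetical.

```python
# Output layout requested in the question, e.g. 2020-06-17T15:08:24z.
# Text inside single quotes is emitted literally by date_format.
OUTPUT_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'z'"


def with_max_date(df, source_col="lastModDate", target_col="maxDate"):
    """Return df with target_col holding source_col formatted via OUTPUT_PATTERN.

    Hypothetical helper: assumes source_col is a timestamp, or a string that
    Spark can cast to a timestamp without an explicit parse pattern.
    """
    # Imported here so the pattern constant can be reused without Spark loaded.
    from pyspark.sql import functions as F

    return df.withColumn(
        target_col,
        F.date_format(F.to_timestamp(F.col(source_col)), OUTPUT_PATTERN),
    )
```

With the DataFrame from the question this would be called as `jobMetadata = with_max_date(existingMaxModifiedDate)`. Quoting the literals avoids the `translate` step used above. If `lastModDt` is stored as a string in some other layout, a matching parse pattern would have to be passed to `to_timestamp`.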