PySpark - generate a date sequence

Time: 2020-03-17 15:27:32

Tags: date pyspark sequence aws-glue

I am trying to generate a sequence of dates:

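from pyspark.sql import functions as F

# Parse the string columns into dates ("MM" is month; lowercase "mm" means minutes)
df1 = df.withColumn("start_dt", F.to_date(F.col("start_date"), "yyyy-MM-dd")).withColumn("end_dt", F.to_date(F.col("end_date"), "yyyy-MM-dd"))
df1.select("start_dt", "end_dt").show()

print("type(start_dt) ", type("start_dt"))
print("type(end_dt) ", type("end_dt"))
# Attempt to build the date sequence with an integer step of 1
df2 = df1.withColumn("lineoffdate", F.expr("""sequence(start_dt,end_dt,1)"""))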

Below is the output:

+---------------+----------+
|   start_date  |  end_date|
+---------------+----------+
|   2020-02-01  |2020-03-21|
+---------------+----------+

type(start_dt)  <class 'str'>
type(end_dt)  <class 'str'>

cannot resolve 'sequence(start_dt, end_dt, 1)' due to data type mismatch: sequence only supports integral, timestamp or date types; line 1 pos 0;

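Even after converting start_dt and end_dt to date or timestamp, the type of the columns is still reported as str, and I get the error above when generating the date sequence.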

1 Answer:

Answer 0: (score: 1)

You are right that it should work with date or timestamp (calendar types). The only mistake is that you passed the "step" argument of sequence as an integer, when for date or timestamp bounds it should be a calendar interval (for example, interval 1 day).
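
The fix described in the answer can be sketched as follows. This is a minimal illustration, not the answerer's original code: it assumes the column names from the question (start_date, end_date, lineoffdate), and the helper names date_sequence_expr and with_date_sequence are hypothetical, introduced only for this example.

```python
def date_sequence_expr(start_col, end_col, step="interval 1 day"):
    """Build a Spark SQL expression string that yields an array of dates.

    The step is a calendar interval rather than an integer, which is what
    sequence() expects when its bounds are dates.
    """
    return f"sequence(to_date({start_col}), to_date({end_col}), {step})"


def with_date_sequence(df, start_col="start_date", end_col="end_date",
                       out_col="lineoffdate"):
    """Return df with an extra array column holding every date in the range.

    Column names default to the ones used in the question (an assumption).
    """
    # Imported lazily so the string helper above works without Spark installed.
    from pyspark.sql import functions as F

    return df.withColumn(out_col, F.expr(date_sequence_expr(start_col, end_col)))
```

With the question's DataFrame, with_date_sequence(df).show(truncate=False) should print one array of consecutive dates per row. To check the real column types, df1.printSchema() or dict(df1.dtypes)["start_dt"] reports the Spark type, whereas type("start_dt") only inspects the Python string literal "start_dt" and therefore always prints str.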