If-else inside a Window function in Spark Scala

Time: 2018-06-20 18:54:19

Tags: scala apache-spark apache-spark-sql

I am trying to combine if-else logic with a window function in Spark.

Input DF:

col1   col2   TimeStamp1   TimeStamp2
 10      1      10:00        11:00
 20      1       2:00         3:00
 20      2       4:00         5:00
 20      3       6:00         7:00
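
For reproducibility, a minimal sketch of building this sample DataFrame (the timestamps are kept as plain strings, and spark is assumed to be an existing SparkSession):

import spark.implicits._  // spark is assumed to be an existing SparkSession

// Sample rows matching the table above; timestamps kept as plain strings
val df = Seq(
  (10, 1, "10:00", "11:00"),
  (20, 1, "2:00",  "3:00"),
  (20, 2, "4:00",  "5:00"),
  (20, 3, "6:00",  "7:00")
).toDF("col1", "col2", "TimeStamp1", "TimeStamp2")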

Window:

time_window =  Window.partitionBy($"col1").orderBy($"col2")

The use case is:

If the combination (col1, max(col2) === 1): then new_col = (unix_timestamp(TimeStamp1) - unix_timestamp(TimeStamp2)).over(time_window)

Else:

//I only need TimeStamp1 to create the lag
new_col = (unix_timestamp($"TimeStamp1") - unix_timestamp(lag($"TimeStamp1", 1))).over(time_window)

Code:

df.withColumn("new_col",
  when(max($"col1").over(time_window) === "1",
    (unix_timestamp($"TimeStamp1") - unix_timestamp($"TimeStamp2"))
      .otherwise((unix_timestamp($"TimeStamp1") - unix_timestamp(lag($"TimeStamp1", 1).over(time_window))) / 3600.0)))

Error:

java.lang.IllegalArgumentException: otherwise() can only be applied on a Column previously generated by when()

Any suggestions on where I am going wrong, or another way to achieve this? Thanks.
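
For reference, the error indicates that otherwise() must be chained onto the Column returned by when(...), whereas in the code above it is attached to the timestamp subtraction inside the when. A minimal sketch of one possible restructuring is below; it assumes the condition is max(col2) per col1 partition (as in the use case above), and the full_window name and the numeric comparison to 1 (rather than the string "1") are assumptions:

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._
import spark.implicits._  // spark is assumed to be an existing SparkSession

val time_window = Window.partitionBy($"col1").orderBy($"col2")

// Assumed: no orderBy here, so max() is computed over the whole col1 partition
// rather than as a running maximum up to the current row
val full_window = Window.partitionBy($"col1")

val result = df.withColumn("new_col",
  when(max($"col2").over(full_window) === 1,
    unix_timestamp($"TimeStamp1") - unix_timestamp($"TimeStamp2"))
    .otherwise(  // otherwise() now hangs off the when(...) column itself
      (unix_timestamp($"TimeStamp1") -
        unix_timestamp(lag($"TimeStamp1", 1).over(time_window))) / 3600.0))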

0 Answers:

No answers yet.