I have a DataFrame with the following data:
scala> nonFinalExpDF.show
+---+----------+
| ID|      DATE|
+---+----------+
|  1|      null|
|  2|2016-10-25|
|  2|2016-10-26|
|  2|2016-09-28|
|  3|2016-11-10|
|  3|2016-10-12|
+---+----------+
From this DataFrame, I want to get the DataFrame below:
+---+----------+----------+
| ID|      DATE| INDICATOR|
+---+----------+----------+
|  1|      null|         1|
|  2|2016-10-25|         0|
|  2|2016-10-26|         1|
|  2|2016-09-28|         0|
|  3|2016-11-10|         1|
|  3|2016-10-12|         0|
+---+----------+----------+
Logic -
Please suggest a simple way to do this.
Answer 0: (score: 2)
Try:
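// Register the question's DataFrame as a temp view (the view name "df" is arbitrary).
nonFinalExpDF.createOrReplaceTempView("df")
// LEAD(...) is NULL only on the last row of each id when ordered by date, i.e. the latest date.
// COALESCE keeps a NULL date in the next row from being mistaken for "no next row".
spark.sql("""
  SELECT id, date,
         CAST(LEAD(COALESCE(date, TO_DATE('1900-01-01')), 1)
              OVER (PARTITION BY id ORDER BY date) IS NULL AS INT) AS indicator
  FROM df""")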
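If you would rather stay in the DataFrame API, the same idea can probably be written with a window and `lead`. This is only a sketch under a few assumptions: it is pasted into spark-shell (or any session with Spark on the classpath), DATE is held as a nullable string in `yyyy-MM-dd` format so that string ordering matches date ordering, and the names `byIdOrderedByDate`, `nextDate` and `withIndicator` are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{coalesce, col, lead, lit}

// In spark-shell, getOrCreate() simply returns the existing session.
val spark = SparkSession.builder().master("local[*]").appName("indicator-sketch").getOrCreate()
import spark.implicits._

// Sample data from the question; DATE as a nullable string is an assumption.
val nonFinalExpDF = Seq[(Int, Option[String])](
  (1, None),
  (2, Some("2016-10-25")),
  (2, Some("2016-10-26")),
  (2, Some("2016-09-28")),
  (3, Some("2016-11-10")),
  (3, Some("2016-10-12"))
).toDF("ID", "DATE")

// Rows of each ID ordered by DATE ascending (Spark sorts nulls first when ascending).
val byIdOrderedByDate = Window.partitionBy("ID").orderBy("DATE")

// Value of the following row; the placeholder stands in for a null DATE so that
// a null result can only mean "there is no following row".
val nextDate = lead(coalesce(col("DATE"), lit("1900-01-01")), 1).over(byIdOrderedByDate)

// 1 on the last row of each ID (the latest DATE), 0 on every other row.
val withIndicator = nonFinalExpDF.withColumn("INDICATOR", nextDate.isNull.cast("int"))

withIndicator.show()
```

For the sample data this should mark 2016-10-26 for ID 2, 2016-11-10 for ID 3, and the single null row for ID 1. If an ID has several rows with the same latest DATE, only one of them gets 1, and which one is not guaranteed.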