I have a DataFrame (df_1) with the following schema:
|-- Column1: string (nullable = true)
|-- Column2: string (nullable = true)
|-- Column3: long (nullable = true)
|-- Column4: double (nullable = true)
df_1 is of type "pyspark.sql.dataframe.DataFrame".
I want to create a new column, Rank, that ranks the rows according to a window I defined (security_window):
import pyspark.sql.functions as F
from pyspark.sql import Window
window = Window.partitionBy(F.col("Column1"), F.col("Column2")).orderBy(F.col("Column3")).rangeBetween(-20, 0)
df_1.withColumn('Rank',F.rank().over(window))
However, when I apply this window function to the DataFrame (df_1), I get the following AnalysisException. Does anyone know what causes it?
pyspark.sql.utils.AnalysisException: Window Frame RANGE BETWEEN 20 PRECEDING AND CURRENT ROW must match the required frame ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
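
To make the two frames in the message concrete, here is a minimal plain-Python sketch (not Spark code) of what each one computes. The message reads as if rank() only accepts the frame ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, because a row's rank depends on every earlier row in its partition, whereas rangeBetween(-20, 0) only looks at rows whose Column3 lies within 20 of the current value. The sample rows and the helper names rank_rows and count_in_range are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (Column1, Column2, Column3, Column4)
rows = [
    ("A", "x", 10, 1.0),
    ("A", "x", 25, 2.0),
    ("A", "x", 25, 3.0),
    ("A", "x", 60, 4.0),
    ("B", "y", 5, 5.0),
]


def rank_rows(rows):
    """Emulate rank() partitioned by (Column1, Column2), ordered by Column3."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row[0], row[1])].append(row)
    ranked = []
    for key in sorted(partitions):
        part = sorted(partitions[key], key=lambda r: r[2])
        for row in part:
            # Rank is 1 plus the number of rows ordered strictly before this
            # one, so it needs the whole partition prefix, not a 20-wide slice.
            rank = 1 + sum(1 for other in part if other[2] < row[2])
            ranked.append(row + (rank,))
    return ranked


def count_in_range(rows, lower=-20, upper=0):
    """Emulate an aggregate (a row count) over rangeBetween(lower, upper)."""
    counted = []
    for row in rows:
        # The frame holds rows of the same partition whose Column3 falls in
        # [current + lower, current + upper], including ties on Column3.
        n = sum(
            1
            for other in rows
            if (other[0], other[1]) == (row[0], row[1])
            and row[2] + lower <= other[2] <= row[2] + upper
        )
        counted.append(row + (n,))
    return counted


ranked = rank_rows(rows)        # last field of each row is the rank
windowed = count_in_range(rows)  # last field is the row count in the frame
```

If that reading is right, dropping .rangeBetween(-20, 0) from the window definition should let F.rank() run, and the range frame would still be usable with an aggregate such as F.count or F.sum over the same partitioning and ordering.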