Window functions / Scala / Spark 1.6

Asked: 2017-02-02 16:37:11

Tags: scala window-functions apache-spark-1.6

I would like to use window functions in Scala.

I have a CSV file that looks like this:

id;date;value1
1;63111600000;100
1;63111700000;200
1;63154800000;300

When I try to apply a window function to this DataFrame, it sometimes works and sometimes fails:

val df = loadCSVFile()
val tw = Window.orderBy("date").partitionBy("id").rangeBetween(-5356800000, 0)
df.withColumn("value1___min_2_month", min(df.col("value1")).over(tw))
+---+-----------+--------------------+
| id|       date|value1___min_2_month|
+---+-----------+--------------------+
|  1|63111600000|                 100|
|  1|63111700000|                 100|
|  1|63154800000|                 100|
+---+-----------+--------------------+

So it works! But when I try with a bigger number (a range that includes the rows from the previous example), I get the following result:

val tw = Window.orderBy("date").partitionBy("id").rangeBetween(-8035200000, 0)
df.withColumn("value1___min_3_month", min(df.col("value1")).over(tw))
+---+-----------+--------------------+
| id|       date|value1___min_3_month|
+---+-----------+--------------------+
|  1|63111600000|                null|
|  1|63111700000|                null|
|  1|63154800000|                null|
+---+-----------+--------------------+
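As a side note, the two range bounds correspond to whole-day spans in epoch milliseconds (a sketch, assuming the `date` column holds epoch millis, which matches the sample values):

```scala
// Assumption: date is in epoch milliseconds, so a "month" window is
// expressed as a number of days times 86 400 000 ms per day.
val twoMonthsMs: Long   = 62L * 24 * 60 * 60 * 1000  // = 5356800000 (≈ 2 months)
val threeMonthsMs: Long = 93L * 24 * 60 * 60 * 1000  // = 8035200000 (≈ 3 months)
```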

1 answer:

Answer 0 (score: 2):

Suffix your number with `L`:
scala> -10000000000
<console>:1: error: integer number too large
-10000000000
 ^

scala> -10000000000L
res0: Long = -10000000000
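The bound in question does not fit in an `Int` at all, which is why the bare literal is rejected; the `L` suffix makes it a `Long`. A minimal check, with the corrected window spec from the question shown as a comment (it requires a Spark session to run):

```scala
// 93 days in milliseconds lies far outside Int range (Int.MinValue is
// -2147483648), so the literal must be written as a Long with the L suffix.
val threeMonthBound: Long = -8035200000L
assert(threeMonthBound < Int.MinValue.toLong)

// Hypothetical corrected version of the question's window spec:
// val tw = Window.orderBy("date").partitionBy("id").rangeBetween(-8035200000L, 0L)
// df.withColumn("value1___min_3_month", min(df.col("value1")).over(tw))
```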