I am trying to order a SparkDataFrame by two of its columns inside a window specification.
Here is an example:
library(magrittr)
library(SparkR)
cars <- cbind(model = rownames(mtcars), mtcars)
carsDF <- createDataFrame(cars)
The following fails:
carsDF %>%
mutate(rank = over(rank(), orderBy(windowPartitionBy(column("cyl")), desc(column("mpg")), desc(column("disp"))))) %>%
head()
I suspect this is related to the use of the column() function, because the following also fails:
carsDF %>%
mutate(rank = over(rank(), orderBy(windowPartitionBy(column("cyl")), column("mpg"), column("disp")))) %>%
head()
The following two commands (without and with the column() function, each passing a single ordering argument to orderBy()) both succeed:
carsDF %>%
mutate(rank = over(rank(), orderBy(windowPartitionBy("cyl"), "mpg"))) %>%
head()
carsDF %>%
mutate(rank = over(rank(), orderBy(windowPartitionBy(column("cyl")), column("mpg")))) %>%
head()