Consider the following dplyr processing of a data frame:
existing.df <- filter(existing.df, justanEx > 0) %>%
  arrange(desc(justanEx)) %>%
  mutate(mean = mean(justanEx),
         median = median(justanEx),
         rank = seq_len(length(anotherVar)))
I have to do this a lot in my current work, so I tried to write a function for it:
df.overZ <- function(data, var){
  df <- data %>% filter(var > 0) %>%
    arrange_(desc((var))) %>%
    mutate(mean = mean(var),
           median = median(var),
           rank = seq_len(length(anotherVar)))
  df
}
and then called it with:
existing.df <- df.overZ(existing.df, "realVar")
But this gives me the following error:
Error in arrange_impl(.data, dots) :
incorrect size (1), expecting : 50000
If I try:
existing.df <- df.overZ(existing.df, realVar)
I get this error:
Error in filter_impl(.data, dots) : obj 'realVar' not found
I have tried filter_, arrange_ and mutate_, but none of them worked.
Is this possible?
The following function does work:
make.df <- function(var, n){
  df <- orign.df %>% filter(!is.na(var)) %>%
    select(1:2,n,3:6)
  df
}
existing.df <- make.df("oneVar",7)
Answer 0 (score: 2)
With the devel version of dplyr (soon to be released as 0.6.0), we can use quosures:
library(dplyr)
df.overZ <- function(data, Var){
  Var <- enquo(Var)
  data %>%
    filter(UQ(Var) > 0) %>%
    arrange(desc(UQ(Var))) %>%
    mutate(Mean = mean(UQ(Var)),
           Median = median(UQ(Var)),
           rank = row_number())
}
df.overZ(iris, Sepal.Length)
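As a hedged side note, in released versions of dplyr the same unquoting is usually written with the `!!` operator rather than `UQ()`. The sketch below assumes a dplyr version that exports `enquo()` and supports `!!`; the function name `df_over_zero` is hypothetical and not part of the original answer.

```r
library(dplyr)

# Hypothetical name; same logic as df.overZ, with `!!` in place of UQ().
df_over_zero <- function(data, var) {
  var <- enquo(var)                  # capture the bare column name as a quosure
  data %>%
    filter((!!var) > 0) %>%          # keep rows where the column is positive
    arrange(desc(!!var)) %>%         # sort by the column, largest first
    mutate(Mean   = mean(!!var),     # mean of the filtered column
           Median = median(!!var),   # median of the filtered column
           rank   = row_number())    # position after sorting
}

res <- df_over_zero(iris, Sepal.Length)
head(res, 3)
```

The parentheses around `!!var` in `filter()` are only there to make the precedence explicit.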
We can extend this function with a group_by option:
df.overZ2 <- function(data, Var, grpVar){
  Var <- enquo(Var)
  grpVar <- enquo(grpVar)
  newVar <- paste(quo_name(Var), c("Mean", "Median", "Rank"), sep="_")
  data %>%
    filter(UQ(Var) > 0) %>%
    arrange(desc(UQ(Var))) %>%
    group_by(UQ(grpVar)) %>%
    summarise(UQ(newVar[1]) := mean(UQ(Var)),
              UQ(newVar[2]) := median(UQ(Var)),
              UQ(newVar[3]) := n())
}
df.overZ2(iris, Sepal.Length, Species)
# A tibble: 3 × 4
#     Species Sepal.Length_Mean Sepal.Length_Median Sepal.Length_Rank
#      <fctr>             <dbl>               <dbl>             <int>
#1     setosa             5.006                 5.0                50
#2 versicolor             5.936                 5.9                50
#3  virginica             6.588                 6.5                50
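Here, enquo does a job similar to substitute from base R: it takes the input argument and converts it to a quosure. Then, inside filter/arrange/mutate/summarise/group_by, we unquote it (with !! or UQ) so that it gets evaluated. We can also name the new columns by unquoting the name on the left-hand side (lhs) of the assignment (:=).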
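To make the naming step concrete, here is a minimal sketch of building a column name from the captured argument and assigning to it with `:=`. It assumes a dplyr version that re-exports `enquo()` and `quo_name()`; the names `df_summary` and `mean_name` are hypothetical and used only for illustration.

```r
library(dplyr)

# Hypothetical helper: grouped mean with a column name derived from the input.
df_summary <- function(data, var, grp) {
  var <- enquo(var)                             # quosure for the value column
  grp <- enquo(grp)                             # quosure for the grouping column
  mean_name <- paste0(quo_name(var), "_Mean")   # e.g. "Sepal.Length_Mean"
  data %>%
    group_by(!!grp) %>%
    summarise(!!mean_name := mean(!!var))       # unquoted string names the column
}

out <- df_summary(iris, Sepal.Length, Species)
out
```

The ordinary `=` cannot take an unquoted expression on its left-hand side, which is why `:=` is used there.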