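I find the R package data.table very useful when working interactively at the console, but using it inside a function makes things trickier. For example, this works: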
library(data.table)
flights <- fread("https://github.com/arunsrinivasan/flights/wiki/NYCflights14/flights14.csv")
flights[origin == "JFK" & month == 6L,
.(m_arr = mean(arr_delay), m_dep = mean(dep_delay))]
But this fails:
x="arr_delay" # x and y are passed from arguments of a function
y="dep_delay"
flights[origin == "JFK" & month == 6L,
.(m_arr = mean(x), m_dep = mean(y))]
Is there a workaround?
Answer 0 (score: 3)
One option is to specify the columns in .SDcols and then take the mean of each column in .SD
setnames(flights[origin == "JFK" & month == 6L,
lapply(.SD, mean), .SDcols = c(x, y)], c('m_arr', 'm_dep'))[]
# m_arr m_dep
#1: 5.839349 9.807884
This can be wrapped in a function
f1 <- function(dat, col1, col2) {
setnames(dat[origin == "JFK" & month == 6L,
lapply(.SD, mean), .SDcols = c(col1, col2)], c('m_arr', 'm_dep'))[]
}
f1(flights, x, y)
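To make the .SDcols pattern easy to try without downloading the flights file, here is a minimal sketch on a small made-up table. The `toy` data and the function name `mean_cols` are assumptions introduced for illustration only; they are not part of data.table or of the original question.

```r
library(data.table)

# Hypothetical toy data standing in for the flights table
toy <- data.table(
  origin    = c("JFK", "JFK", "LGA", "JFK"),
  month     = c(6L, 6L, 6L, 7L),
  arr_delay = c(10, 20, 99, 99),
  dep_delay = c(4, 8, 99, 99)
)

# mean_cols() is an illustrative name: column names arrive as a character vector
mean_cols <- function(dat, cols, new_names = cols) {
  # Filtering with i returns a new table, so setnames() below leaves dat untouched
  res <- dat[origin == "JFK" & month == 6L, lapply(.SD, mean), .SDcols = cols]
  setnames(res, new_names)  # with two arguments, all columns are renamed
  res[]
}

out <- mean_cols(toy, c("arr_delay", "dep_delay"), c("m_arr", "m_dep"))
print(out)
```

Passing the columns as one character vector, rather than as separate arguments, lets the same function handle any number of columns.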
If we would rather not do that, get() is another way to retrieve the column values
flights[origin == "JFK" & month == 6L,
.(m_arr = mean(get(x)), m_dep = mean(get(y)))]
# m_arr m_dep
#1: 5.839349 9.807884
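The sketch below, on a made-up two-row table, may help show why the version in the question fails and why get() fixes it: mean(x) is applied to the character string stored in x, whereas get(x) looks up the column that the string names. The `toy` table is an assumption for illustration.

```r
library(data.table)

# Hypothetical toy data, not the real flights table
toy <- data.table(arr_delay = c(10, 20), dep_delay = c(4, 8))
x <- "arr_delay"

# mean() applied to the string itself: NA, with a warning
on_string <- suppressWarnings(mean(x))

# get(x) resolves the name stored in x to the column of that name
on_column <- toy[, mean(get(x))]

print(on_string)
print(on_column)
```

One caveat worth keeping in mind: if the table itself had a column called x, then x inside j would refer to that column rather than to the outer variable, so argument names that cannot collide with column names are the safer choice.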
Or another option is eval(as.name(...))
f2 <- function(dat, col1, col2) {
dat[origin == "JFK" & month == 6L,
.(m_arr = mean(eval(as.name(col1))), m_dep = mean(eval(as.name(col2))))]
}
f2(flights, x, y)
# m_arr m_dep
#1: 5.839349 9.807884
An option using the tidyverse would be
library(dplyr)

f3 <- function(dat, col1, col2) {
  dat %>%
    filter(origin == "JFK", month == 6L) %>%
    # summarise_at() accepts a character vector of column names
    summarise_at(c(col1, col2), mean) %>%
    rename(m_arr = !!rlang::sym(col1),
           m_dep = !!rlang::sym(col2))
}
f3(flights, x, y)
# m_arr m_dep
#1 5.839349 9.807884
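As a possible alternative within dplyr, the `.data` pronoun can look up a column from a string directly inside summarise(), which avoids the separate rename step. This is a sketch on made-up data; the `toy` data frame and the function name `mean_delays_tidy` are assumptions for illustration, not something from the original answer.

```r
library(dplyr)

# Hypothetical toy data standing in for the flights table
toy <- data.frame(
  origin    = c("JFK", "JFK", "LGA", "JFK"),
  month     = c(6L, 6L, 6L, 7L),
  arr_delay = c(10, 20, 99, 99),
  dep_delay = c(4, 8, 99, 99)
)

# mean_delays_tidy() is an illustrative name: col1 and col2 are strings
mean_delays_tidy <- function(dat, col1, col2) {
  dat %>%
    filter(origin == "JFK", month == 6L) %>%
    summarise(m_arr = mean(.data[[col1]]),
              m_dep = mean(.data[[col2]]))
}

res <- mean_delays_tidy(toy, "arr_delay", "dep_delay")
print(res)
```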