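I have a table with more than 150 variables recorded daily over 5 years. For each month, I want to create a summary of the daily average of each variable. However, if the month is January, May, July, September, November, or December, I want to divide the sum of all values by count - 1 instead of by the count.
dplyr's summarise_each does most of what I want. However, I have not managed to integrate a custom function into the funs argument: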
by.ym <- training %>% filter(Day.W!=1) %>% group_by(training, year=year(Date), month=month(Date))
testb <- summarise_each(by.ym[,-c(1:3)],
funs(. / (if (month %in% c(1, 5, 7, 9, 11, 12)) {
sum(.)/(nrow(.)-1)
} else mean(.))
))
The error message is:
Error: expecting a single value
In addition: Warning messages:
1: In if (c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, :
the condition has length > 1 and only the first element will be used
2: In if (c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, :
the condition has length > 1 and only the first element will be used
Answer 0 (score: 1)
Putting the suggestions from the comments together, and using iris as test data:
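library(dplyr)
library(tidyr)
# Lookup table: bevel is 1 for months whose denominator is count - 1, else 0
multipliers = tibble(
  month = 1:12,
  bevel = c(1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1)
)
iris %>%
  select(-Species) %>%
  # iris has no dates, so assign months 1..12 cyclically for the demo
  mutate(month = rep(1:12, length.out = n())) %>%
  gather(variable, value, -month) %>%        # long format: one row per value
  left_join(multipliers, by = "month") %>%   # attach bevel to every row
  group_by(month, variable) %>%
  summarize(value = sum(value) / (n() - first(bevel))) %>%
  spread(variable, value)                    # wide format: one column per variable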
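
As for why the original attempt fails: inside a group, month is a whole vector, but if() needs a single TRUE/FALSE, hence the "condition has length > 1" warning. In addition, nrow(.) returns NULL for a plain column vector, and the leading ". /" divides every value by the result, so the function does not return a single value per group. summarise_each and funs have also since been deprecated in dplyr. Below is a rough sketch of the same idea with across() (dplyr 1.0.0 or later), again on iris with a made-up month column. The helper adjusted_mean and the vector special_months are names I introduced for illustration, not part of the question.

```r
library(dplyr)

# Hypothetical helper (not from the question): mean with an optionally
# reduced denominator. `adjust` must be a single TRUE/FALSE value.
adjusted_mean <- function(x, adjust) {
  if (adjust) sum(x) / (length(x) - 1) else mean(x)
}

# Months whose denominator is count - 1, as listed in the question
special_months <- c(1, 5, 7, 9, 11, 12)

result <- iris %>%
  select(-Species) %>%
  # iris has no dates, so assign months 1..12 cyclically for the demo
  mutate(month = rep(1:12, length.out = n())) %>%
  group_by(month) %>%
  summarise(
    # first(month) collapses the per-group month vector to one value,
    # so the if() inside adjusted_mean sees a length-1 condition
    across(everything(),
           ~ adjusted_mean(.x, first(month) %in% special_months)),
    .groups = "drop"
  )

result
```

This avoids the reshape to long format and back, at the cost of evaluating the month test once per column. With real data, month would come from something like lubridate's month(Date) instead of the cyclic placeholder used here.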