我想创建一个函数,可以计算可变数量的最后观察值和不同变量的移动平均值。以此作为模拟数据:
df = expand.grid(site = factor(seq(10)),
year = 2000:2004,
day = 1:50)
df$temp = rpois(dim(df)[1], 5)
计算1个变量和固定数量的最后观察值。例如。这计算过去5天的平均温度:
library(dplyr)
library(zoo)
df <- df %>%
group_by(site, year) %>%
arrange(site, year, day) %>%
mutate(almost_avg = rollmean(x = temp, 5, align = "right", fill = NA)) %>%
mutate(avg = lag(almost_avg, 1))
到目前为止一切顺利。现在尝试功能化失败。
avg_last_x <- function(dataframe, column, last_x) {
dataframe <- dataframe %>%
group_by(site, year) %>%
arrange(site, year, day) %>%
mutate(almost_avg = rollmean(x = column, k = last_x, align = "right", fill = NA)) %>%
mutate(avg = lag(almost_avg, 1))
return(dataframe) }
avg_last_x(dataframe = df, column = "temp", last_x = 10)
我收到此错误:
Error in mutate_impl(.data, dots) : k <= n is not TRUE
我知道这可能与evaluation mechanism in dplyr有关,但我没有修复它。
提前感谢您的帮助。
答案 0 :(得分:6)
这应该解决它。
library(lazyeval)
avg_last_x <- function(dataframe, column, last_x) {
dataframe %>%
group_by(site, year) %>%
arrange(site, year, day) %>%
mutate_(almost_avg = interp(~rollmean(x = c, k = last_x, align = "right",
fill = NA), c = as.name(column)),
avg = ~lag(almost_avg, 1))
}