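For each row, I am trying to calculate the mean of the n values in one vector whose corresponding values in a second vector fall within a fixed range around that row's value, separately within each level of a categorical factor.
I have included a data frame with sample data and the expected result.
Basically, I am working with fish catch data and acoustic estimates from many lakes. The catch and acoustic (gha) data are stratified by depth, because nets are set at different depths (some depths are missing and some are duplicated). I would like to increase the size of the original depth strata by pooling catch data from adjacent depth strata (±2 m).
The expected results (mean.catch, mean.g.ha) were calculated by hand: for each lake and each depth x, mean.catch and mean.g.ha are the means of the n catch and g.ha values whose depth lies within x ± 2 m.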
lake <- c("a","a","a","a","a","a","a","a", "b","b","b","b","b", "b","b","b","b","b", "b","b", "b","b","b","b")
net.id <- c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
catch <- c(0:23)
g.ha <- c(1:24)
depth <- c(0, 1, 3, 4, 6, 7, 9, 10, 0, 1, 3, 4, 11, 13, 14, 16, 11, 12, 14, 15, 20, 22, 23, 25)
mean.catch <- c(0.5, 1, 2, 3, 4, 5, 6, 6.5, 8.5, 9, 10, 10.5, 14.5, 15.57142857, 16, 16.5, 14.5, 15, 16, 15.8, 20.5, 21, 22, 22.5)
mean.g.ha <- c(1.5, 2, 3, 4, 5, 6, 7, 7.5, 9.5, 10, 11, 11.5, 15.5, 16.57142857, 17, 17.5, 15.5, 16, 17, 16.8, 21.5, 22, 23, 23.5)
df <- data.frame(lake, net.id, depth, catch, g.ha, mean.catch, mean.g.ha)
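To make the pooling rule concrete, here is a minimal worked example for a single stratum: lake b at depth 13 m, where the window covers 11 to 15 m. It uses only the lake b depths and catches from the sample data; the names `b_depth`, `b_catch`, and `in_window` are illustrative and not part of the original code.

```r
# Lake b rows from the sample data (depths in m, catches 8..23)
b_depth <- c(0, 1, 3, 4, 11, 13, 14, 16, 11, 12, 14, 15, 20, 22, 23, 25)
b_catch <- 8:23

# Select every net whose depth lies within 13 +/- 2 m
in_window <- abs(b_depth - 13) <= 2

pooled <- b_catch[in_window]   # 12 13 14 16 17 18 19 (n = 7)
pooled_mean <- mean(pooled)    # 109 / 7, about 15.571
print(pooled)
print(pooled_mean)
```

This matches the hand-calculated mean.catch value of 15.57142857 for that row.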
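The approach below, taken from the answer to "R - Faster Way to Calculate Rolling Statistics Over a Variable Interval", works, but I have to create a subset for each lake. Is it possible to apply it to all lakes at once, rather than repeating the code and creating many subsets?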
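# Subset a single lake (this step has to be repeated for every lake)
a <- subset(df, lake == "a")
as <- a[, c(1, 3)]  # lake and depth columns
as

# Mean of y over all points whose x lies within xout[i] +/- width
rollmean_r <- function(x, y, xout, width) {
  out <- numeric(length(xout))
  for (i in seq_along(xout)) {
    window <- x >= (xout[i] - width) & x <= (xout[i] + width)
    out[i] <- mean(y[window])
  }
  return(out)
}

x <- a$depth
y <- a$catch
As <- rollmean_r(x, y, xout = x, width = 2)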
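One possible way to avoid the per-lake subsets, sketched here in base R under the assumption that a ±2 m window within each lake is all that is needed: split the row indices by lake and fill in the result group by group. The helper `window_mean` and the loop variable `idx` are hypothetical names introduced for this sketch, and the data frame is rebuilt from the sample data so the block runs on its own.

```r
# Sample data from the question (only the columns needed here)
lake  <- rep(c("a", "b"), times = c(8, 16))
depth <- c(0, 1, 3, 4, 6, 7, 9, 10, 0, 1, 3, 4, 11, 13, 14, 16,
           11, 12, 14, 15, 20, 22, 23, 25)
catch <- 0:23
g.ha  <- 1:24
df <- data.frame(lake, depth, catch, g.ha)

# Hypothetical helper: for each x[i], the mean of y over all
# points whose x lies within x[i] +/- width
window_mean <- function(x, y, width) {
  vapply(x, function(xi) mean(y[abs(x - xi) <= width]), numeric(1))
}

# Loop over the row indices of each lake and write the results
# back in place, so the original row order is preserved
df$mean.catch <- NA_real_
df$mean.g.ha  <- NA_real_
for (idx in split(seq_len(nrow(df)), df$lake)) {
  df$mean.catch[idx] <- window_mean(df$depth[idx], df$catch[idx], width = 2)
  df$mean.g.ha[idx]  <- window_mean(df$depth[idx], df$g.ha[idx],  width = 2)
}

print(df)
```

On the sample data this reproduces the hand-calculated mean.catch column. The window comparison is quadratic in the number of rows per lake, which should be fine for a few nets per lake but may be slow for very large groups.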