I have a data frame, df1, that records water temperature over time at 2-metre depth intervals, down to a depth of 39 metres. For example:
df1<-data.frame(Datetime=c("2016-08-18 00:00:00","2016-08-18 00:01:00","2016-08-18 00:02:00","2016-08-18 00:03:00"),
Site=c("BD","HG","BD","HG"),
m0=c(2,5,6,1),
m2=c(3,5,2,4),
m4=c(4,1,9,3),
m6=c(2,5,6,1),
m8=c(3,5,2,4),
m10=c(2,5,6,1),
m12=c(4,1,9,3),
m14=c(3,5,2,4),
m16=c(2,5,6,1),
m18=c(4,1,9,3),
m20=c(3,5,2,4),
m22=c(2,5,6,1),
m24=c(4,1,9,3),
m26=c(3,5,2,4),
m28=c(2,5,6,1),
m30=c(4,1,9,3),
m32=c(3,5,2,4),
m34=c(2,5,6,1),
m36=c(4,1,9,3),
m38=c(3,5,2,4)
)
> df1
Datetime Site m0 m2 m4 m6 m8 m10 m12 m14 m16 m18 m20 m22 m24 m26 m28 m30 m32 m34 m36 m38
1 2016-08-18 00:00:00 BD 2 3 4 2 3 2 4 3 2 4 3 2 4 3 2 4 3 2 4 3
2 2016-08-18 00:01:00 HG 5 5 1 5 5 5 1 5 5 1 5 5 1 5 5 1 5 5 1 5
3 2016-08-18 00:02:00 BD 6 2 9 6 2 6 9 2 6 9 2 6 9 2 6 9 2 6 9 2
4 2016-08-18 00:03:00 HG 1 4 3 1 4 1 3 4 1 3 4 1 3 4 1 3 4 1 3 4
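I would like to calculate the water temperature at 8-metre intervals instead of 2-metre intervals by averaging across the appropriate columns. For example, I want to collapse columns m0, m2, m4 and m6 into a single column m3.5 that holds the mean water temperature between 0 and 7 metres depth.
This is the result I am after: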
> df1
Datetime Site m3.5 m11.5 m19.5 m27.5 m35.5
1 2016-08-18 00:00:00 BD 2.75 3.00 2.75 3.25 3.00
2 2016-08-18 00:01:00 HG 4.00 4.00 4.00 3.00 4.00
3 2016-08-18 00:02:00 BD 5.75 4.75 5.75 6.50 4.75
4 2016-08-18 00:03:00 HG 2.25 3.00 2.25 2.75 3.00
Does anyone know how to do this with dplyr?
Answer 0 (score: 2)
Here is a solution that works for an arbitrary number of columns:
num_meters <- 39
grp <- as.factor(cumsum(seq(0,num_meters, 2) %% 8 == 0))
df <- data.frame(df1[,c(1,2)],
t(apply(df1[,-c(1,2)], 1, function(x) tapply(x, grp, mean))))
# Datetime Site X1 X2 X3 X4 X5
#1 2016-08-18 00:00:00 BD 2.75 3.00 2.75 3.25 3.00
#2 2016-08-18 00:01:00 HG 4.00 4.00 4.00 3.00 4.00
#3 2016-08-18 00:02:00 BD 5.75 4.75 5.75 6.50 4.75
#4 2016-08-18 00:03:00 HG 2.25 3.00 2.25 2.75 3.00
# in case you also need the colnames that you have specified
colnames(df)[-c(1,2)] <- paste("m", tapply(seq(0,num_meters, 2), grp, mean) + 0.5, sep = "")
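To make the grouping step above easier to follow, here is a small base R sketch of what grp and the generated column names contain. It assumes the same layout as in the question (one reading every 2 m, bins of 8 m); the names depths, bins and labels are introduced here only for illustration.

```r
# Depths sampled every 2 m from 0 to 38 m (matching columns m0 ... m38)
depths <- seq(0, 39, 2)

# A new bin starts whenever the depth is a multiple of 8,
# so the running count of those starts is the bin index (1 to 5)
grp <- cumsum(depths %% 8 == 0)

# Depths that fall into each bin: 0-6, 8-14, 16-22, 24-30, 32-38
bins <- split(depths, grp)

# The mean of each bin is 3, 11, 19, ...; adding 0.5 gives the
# m3.5, m11.5, ... labels requested in the question
labels <- paste0("m", sapply(bins, mean) + 0.5)
print(labels)
```

Each bin holds four consecutive depth columns, which is why tapply(x, grp, mean) averages exactly the columns the question asks for.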
Answer 1 (score: 2)
With the tidyverse, you can also do the following:
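library(dplyr)
library(tidyr)
df1 %>%
  gather(var, val, -Datetime, -Site) %>%
  # gather() stacks the columns in order, so each run of 16 rows is 4 columns x 4 rows
  mutate(group = rep(seq(3.5, 35.5, 8), each = 16)) %>%
  group_by(group, Site, Datetime) %>%
  summarise(value = mean(val)) %>%
  spread(group, value)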
Site Datetime `3.5` `11.5` `19.5` `27.5` `35.5`
<fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
1 BD 2016-08-18 00:00:00 2.75 3 2.75 3.25 3
2 BD 2016-08-18 00:02:00 5.75 4.75 5.75 6.5 4.75
3 HG 2016-08-18 00:01:00 4 4 4 3 4
4 HG 2016-08-18 00:03:00 2.25 3 2.25 2.75 3
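In current tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a sketch of the same idea with those functions, deriving the bin from the depth in the column name rather than from row position, so it does not depend on the number of rows. It assumes tidyr >= 1.0 and dplyr >= 1.0; sample_df is a hypothetical two-row, eight-column subset of df1 used only to keep the example short.

```r
library(dplyr)
library(tidyr)

# Hypothetical small subset of df1 (first two rows, depths 0-14 m)
sample_df <- data.frame(
  Datetime = c("2016-08-18 00:00:00", "2016-08-18 00:01:00"),
  Site = c("BD", "HG"),
  m0 = c(2, 5), m2 = c(3, 5), m4 = c(4, 1), m6 = c(2, 5),
  m8 = c(3, 5), m10 = c(2, 5), m12 = c(4, 1), m14 = c(3, 5)
)

res <- sample_df %>%
  pivot_longer(-c(Datetime, Site), names_to = "var", values_to = "val") %>%
  # Strip the leading "m" to get the numeric depth, then map it to its 8 m bin label
  mutate(depth = as.numeric(sub("^m", "", var)),
         bin = 8 * (depth %/% 8) + 3.5) %>%
  group_by(Datetime, Site, bin) %>%
  summarise(value = mean(val), .groups = "drop") %>%
  # Keeping bin numeric until here preserves the 3.5, 11.5, ... column order
  pivot_wider(names_from = bin, values_from = value, names_prefix = "m")

print(res)
```

Because the bin is computed from the depth itself, the same pipeline should work unchanged on the full df1.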
Answer 2 (score: 1)
You may be looking for rowMeans:
df1$m3.5 <- rowMeans(df1[, c("m0", "m2", "m4", "m6")])
No need for dplyr.
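The rowMeans call above produces one column; one possible way to extend it to every 8 m bin in base R is sketched below. It assumes the depth columns are all named "m" followed by the depth in metres; sample_df, depth_cols, bins and result are names introduced here for illustration, and sample_df is a hypothetical two-row subset of df1.

```r
# Hypothetical small subset of df1 (first two rows, depths 0-14 m)
sample_df <- data.frame(
  Datetime = c("2016-08-18 00:00:00", "2016-08-18 00:01:00"),
  Site = c("BD", "HG"),
  m0 = c(2, 5), m2 = c(3, 5), m4 = c(4, 1), m6 = c(2, 5),
  m8 = c(3, 5), m10 = c(2, 5), m12 = c(4, 1), m14 = c(3, 5)
)

# Pick out the depth columns and recover their numeric depths
depth_cols <- grep("^m[0-9]+$", names(sample_df), value = TRUE)
depths <- as.numeric(sub("^m", "", depth_cols))

# Group column names by 8 m bin: bin 0 covers 0-6 m, bin 1 covers 8-14 m, ...
bins <- split(depth_cols, depths %/% 8)

# One rowMeans() call per bin; the result is a matrix with one column per bin
means <- sapply(bins, function(cols) rowMeans(sample_df[, cols, drop = FALSE]))
colnames(means) <- paste0("m", 8 * as.numeric(names(bins)) + 3.5)

# Put the identifier columns back in front of the averaged columns
result <- cbind(sample_df[, setdiff(names(sample_df), depth_cols), drop = FALSE], means)
print(result)
```

Note that sapply() only returns a matrix here when the data frame has more than one row.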
Answer 3 (score: 1)
The following will do it.
library(dplyr)
df1 %>%
mutate(m3.5 = rowMeans(.[3:6]),
m11.5 = rowMeans(.[7:10]),
m19.5 = rowMeans(.[11:14]),
m27.5 = rowMeans(.[15:18]),
m35.5 = rowMeans(.[19:22])) %>%
select(Datetime, Site, m3.5:m35.5)
# Datetime Site m3.5 m11.5 m19.5 m27.5 m35.5
#1 2016-08-18 00:00:00 BD 2.75 3.00 2.75 3.25 3.00
#2 2016-08-18 00:01:00 HG 4.00 4.00 4.00 3.00 4.00
#3 2016-08-18 00:02:00 BD 5.75 4.75 5.75 6.50 4.75
#4 2016-08-18 00:03:00 HG 2.25 3.00 2.25 2.75 3.00