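Sorry, I may have been using the wrong search terms, but I could not find a solution.
Given an experiment with two participants (id), each performing a task 6 times under two different parameters (par1, par2):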
id <- c(rep(1,6),rep(2,6))
par1 <- c(rep("a",9),rep("b",3))
par2 <- c(rep("c",3),rep("d",9))
val <- rnorm(12)
data <- data.frame(id,par1,par2,val)
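How can I collapse all rows that share the same values of "id", "par1" and "par2" into a single row, where "val" is the mean of the "val" values of the rows being replaced?
The result would be a table like this: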
id par1 par2 val
1 a c (mean of row 1-3)
1 a d (mean of row 4-6)
2 a d (mean of row 7-9)
2 b d (mean of row 10-12)
Answer 0 (score: 2)
A dplyr approach:
library(dplyr)
set.seed(123) # for reproducibility
id <- c(rep(1, 6), rep(2, 6))
par1 <- c(rep("a", 9), rep("b", 3))
par2 <- c(rep("c", 3), rep("d", 9))
val <- rnorm(12)
data <- data.frame(id, par1, par2, val)
# group by all variables except `val`
data %>% group_by_at(vars(-val)) %>% summarize(val = mean(val))
This gives:
# A tibble: 4 x 4
# Groups: id, par1 [?]
id par1 par2 val
<dbl> <fctr> <fctr> <dbl>
1 1 a c 0.2560184
2 1 a d 0.6382870
3 2 a d -0.4969993
4 2 b d 0.3794112
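As a variation on the call above, the grouping columns can also be named explicitly. This is only a sketch, assuming dplyr 1.0.0 or later (where summarise() accepts the .groups argument); the variable name result is made up for illustration. The scoped verb group_by_at() still works, but the dplyr documentation marks it as superseded.

```r
library(dplyr)

set.seed(123) # for reproducibility
data <- data.frame(
  id   = c(rep(1, 6), rep(2, 6)),
  par1 = c(rep("a", 9), rep("b", 3)),
  par2 = c(rep("c", 3), rep("d", 9)),
  val  = rnorm(12)
)

# Group by the three key columns by name, then average `val` within each group.
# `.groups = "drop"` returns an ungrouped tibble (assumes dplyr >= 1.0.0).
result <- data %>%
  group_by(id, par1, par2) %>%
  summarise(val = mean(val), .groups = "drop")

result
```

Naming the columns makes the grouping robust if further non-key columns are added to the data later, whereas vars(-val) would silently group by those as well.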
Answer 1 (score: 1)
Here is an option with data.table. Convert the 'data.frame' to a 'data.table' (setDT(data)), group by 'id', 'par1' and 'par2', and get the mean of 'val':
library(data.table)
setDT(data)[, .(val = mean(val)), by = .(id, par1, par2)]
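For comparison, the same grouping can be done without any extra packages. The following is a minimal sketch using base R's aggregate() with a formula; it is not part of either answer, and the names df and agg are made up for illustration.

```r
set.seed(123) # for reproducibility
df <- data.frame(
  id   = c(rep(1, 6), rep(2, 6)),
  par1 = c(rep("a", 9), rep("b", 3)),
  par2 = c(rep("c", 3), rep("d", 9)),
  val  = rnorm(12)
)

# Average `val` within every observed combination of id, par1 and par2.
agg <- aggregate(val ~ id + par1 + par2, data = df, FUN = mean)

# Sort by the key columns so the rows appear in the order shown in the question.
agg <- agg[order(agg$id, agg$par1, agg$par2), ]
rownames(agg) <- NULL

agg
```

Only combinations that actually occur in the data appear in the result, so this yields the four rows asked for in the question.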