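I have a list (dflist) containing data frames (dfX) that hold measurements for a set of samples (e.g. samples 1-3; samp). Each data frame contains the measurements for one particular sample taken with one particular instrument (e.g. instruments 1-3; inst). For example, data frame 1 contains the instrument 1 measurements for sample 1, data frame 2 contains the instrument 2 measurements for sample 1, data frame 3 contains the instrument 1 measurements for sample 3, and so on.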
> a1 <- c('a1', 'b1', 'c1')
> a2 <- c('a2', 'b2', 'c2')
> a3 <- c('a3', 'b3', 'c3')
> a4 <- c('a4', 'b4', 'c4')
> b1 <- c(1:3)
> b2 <- c(4:6)
> b3 <- c(7:9)
> b4 <- c(10:12)
> c1 <- c('samp1', 'samp1', 'samp1')
> c2 <- c('samp1', 'samp1', 'samp1')
> c3 <- c('samp2', 'samp2', 'samp2')
> c4 <- c('samp2', 'samp2', 'samp2')
> d1 <- c('inst1', 'inst1', 'inst1')
> d2 <- c('inst2', 'inst2', 'inst2')
> d3 <- c('inst1', 'inst1', 'inst1')
> d4 <- c('inst2', 'inst2', 'inst2')
> df1 <- data.frame(a1, b1, c1, d1)
> df2 <- data.frame(a2, b2, c2, d2)
> df3 <- data.frame(a3, b3, c3, d3)
> df4 <- data.frame(a4, b4, c4, d4)
> nams <- c('Reads', 'Mean_Val', 'Samp', 'Inst')
> dflist <- list(df1, df2, df3, df4)
> dflist <- lapply(dflist, setNames, nm=nams)
> dflist
[[1]]
Reads Mean_Val Samp Inst
1 a1 1 samp1 inst1
2 b1 2 samp1 inst1
3 c1 3 samp1 inst1
[[2]]
Reads Mean_Val Samp Inst
1 a2 4 samp1 inst2
2 b2 5 samp1 inst2
3 c2 6 samp1 inst2
[[3]]
Reads Mean_Val Samp Inst
1 a3 7 samp2 inst1
2 b3 8 samp2 inst1
3 c3 9 samp2 inst1
[[4]]
Reads Mean_Val Samp Inst
1 a4 10 samp2 inst2
2 b4 11 samp2 inst2
3 c4 12 samp2 inst2
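What I would like to do is iterate through the list and combine the data frames that hold measurements for the same sample (i.e. combine the df by samp), to get output like this: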
[[1]]
Reads Mean_Val Samp Inst
1 a1 1 samp1 inst1
2 b1 2 samp1 inst1
3 c1 3 samp1 inst1
4 a2 4 samp1 inst2
5 b2 5 samp1 inst2
6 c2 6 samp1 inst2
[[2]]
Reads Mean_Val Samp Inst
1 a3 7 samp2 inst1
2 b3 8 samp2 inst1
3 c3 9 samp2 inst1
4 a4 10 samp2 inst2
5 b4 11 samp2 inst2
6 c4 12 samp2 inst2
I believe the solution will involve merge and subset, but I really don't know how to go about it, and I have hit a complete dead end.
Answer 0 (score: 3)
You can bind them all together with:
Reduce(rbind, dflist)
which gives:
Reads Mean_Val Samp Inst
1 a1 1 samp1 inst1
2 b1 2 samp1 inst1
3 c1 3 samp1 inst1
4 a2 4 samp1 inst2
5 b2 5 samp1 inst2
6 c2 6 samp1 inst2
7 a3 7 samp2 inst1
8 b3 8 samp2 inst1
9 c3 9 samp2 inst1
10 a4 10 samp2 inst2
11 b4 11 samp2 inst2
12 c4 12 samp2 inst2
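As a side note that is not part of the original answer: do.call(rbind, dflist) should give the same combined data frame, but it passes every list element to a single rbind call instead of binding them pairwise. A minimal sketch, where the small dflist is a cut-down stand-in for the question's data (an assumption for illustration):

```r
# Cut-down stand-in for the question's list (two rows per data frame)
dflist <- list(
  data.frame(Reads = c("a1", "b1"), Mean_Val = 1:2, Samp = "samp1", Inst = "inst1"),
  data.frame(Reads = c("a2", "b2"), Mean_Val = 3:4, Samp = "samp1", Inst = "inst2"),
  data.frame(Reads = c("a3", "b3"), Mean_Val = 5:6, Samp = "samp2", Inst = "inst1")
)

# One rbind call over all list elements at once
combined <- do.call(rbind, dflist)

# Compare with the pairwise Reduce approach from the answer
same_as_reduce <- isTRUE(all.equal(combined, Reduce(rbind, dflist)))
```

For a handful of data frames the two are interchangeable; with a long list, the single call avoids building an intermediate data frame at every step.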
If you want it back as a list of data frames split by sample (although, IMHO, the full data frame is probably easier to work with):
df <- Reduce(rbind, dflist)
split(df, df$Samp)
which gives you back a list of length 2:
$samp1
Reads Mean_Val Samp Inst
1 a1 1 samp1 inst1
2 b1 2 samp1 inst1
3 c1 3 samp1 inst1
4 a2 4 samp1 inst2
5 b2 5 samp1 inst2
6 c2 6 samp1 inst2
$samp2
Reads Mean_Val Samp Inst
7 a3 7 samp2 inst1
8 b3 8 samp2 inst1
9 c3 9 samp2 inst1
10 a4 10 samp2 inst2
11 b4 11 samp2 inst2
12 c4 12 samp2 inst2
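The split result differs from the output requested in the question in two small ways: the list elements are named ($samp1, $samp2) rather than indexed as [[1]], [[2]], and the row names carry over from the combined data frame (7 to 12 for samp2). If an exact match matters, something along these lines should work. This is a sketch on a cut-down stand-in for the question's data, and the name by_samp is just an illustrative choice:

```r
# Cut-down stand-in for the question's list (two rows per data frame)
dflist <- list(
  data.frame(Reads = c("a1", "b1"), Mean_Val = 1:2, Samp = "samp1", Inst = "inst1"),
  data.frame(Reads = c("a2", "b2"), Mean_Val = 3:4, Samp = "samp1", Inst = "inst2"),
  data.frame(Reads = c("a3", "b3"), Mean_Val = 5:6, Samp = "samp2", Inst = "inst1"),
  data.frame(Reads = c("a4", "b4"), Mean_Val = 7:8, Samp = "samp2", Inst = "inst2")
)

# Combine, then split by sample as in the answer
df <- Reduce(rbind, dflist)
by_samp <- split(df, df$Samp)

# Restart row numbering at 1 inside each data frame
by_samp <- lapply(by_samp, function(d) {
  rownames(d) <- NULL
  d
})

# Drop the list names so elements print as [[1]], [[2]]
by_samp <- unname(by_samp)
```

Keeping the names is often more convenient in practice, since by_samp$samp1 is easier to read than by_samp[[1]]; dropping them is only needed to reproduce the requested printout.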
Good luck!