I have a data frame like this:
Type Sample Version
C1 A 2
C1 A 4
C1 A 6
C1 B 3
C1 B 5
C1 B 7
C1 C 1
C1 C 3
C1 C 5
C1 D 0
C1 D 0
C1 D 0
C1 D 0
C1 D 0
C1 D 0
C1 D 0
C1 D 0
. . .
C3 A 2
C3 A 4
C3 A 6
C3 B 3
C3 B 5
C3 B 7
C3 C 1
C3 C 3
C3 C 5
C3 D 0
C3 D 0
C3 D 0
C3 D 0
C3 D 0
C3 D 0
C3 D 0
C3 D 0
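Within each Type, I want to copy the 8 rows of Sample D into every other sample (A, B and C), setting Sample in the copied rows to the sample they were copied into.
In other words: append D's rows to A, B and C, relabel the copies as A, B and C, and then remove D from the data frame.
The final data frame should look like this: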
Type Sample Version
C1 A 2
C1 A 4
C1 A 6
C1 A 0
C1 A 0
C1 A 0
C1 A 0
C1 A 0
C1 A 0
C1 A 0
C1 A 0
C1 B 3
C1 B 5
C1 B 7
C1 B 0
C1 B 0
C1 B 0
C1 B 0
C1 B 0
C1 B 0
C1 B 0
C1 B 0
C1 C 1
C1 C 3
C1 C 5
C1 C 0
C1 C 0
C1 C 0
C1 C 0
C1 C 0
C1 C 0
C1 C 0
C1 C 0
I worked out how to do this by looping over Type and Sample:
# For every Type, append a copy of that Type's D rows to each non-D Sample
for (i in unique(dataframe$Type)) {
  for (j in setdiff(unique(dataframe$Sample), "D")) {
    tmp <- dataframe[which(dataframe$Type == i & dataframe$Sample == "D"), ]
    tmp$Sample <- j
    dataframe <- rbind(dataframe, tmp)
  }
}
# Drop the original D rows
dataframe <- dataframe[which(dataframe$Sample != "D"), ]
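Surely there is a better way to do this with dplyr?
Update: I modified the final data frame to bring it closer to the real case, although it is still a simplified example.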
Answer 0 (score: 0)
With data.table, you can do:
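library(data.table)
setDT(DF)  # convert the data frame to a data.table by reference

DF[, {
  # D rows for this Type, without the Sample column
  dd = .SD[Sample == "D", !"Sample"]
  # append them to each remaining Sample; `by` fills in Sample for the copied rows
  .SD[Sample != "D", rbind(.SD, dd, fill=TRUE), by=Sample]
}, by=.(Type)]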
which gives
Type Sample Version
1: C1 A 2
2: C1 A 4
3: C1 A 6
4: C1 A 0
5: C1 A 0
6: C1 A 0
7: C1 A 0
8: C1 A 0
9: C1 A 0
10: C1 A 0
11: C1 A 0
12: C1 B 3
13: C1 B 5
14: C1 B 7
15: C1 B 0
16: C1 B 0
17: C1 B 0
18: C1 B 0
19: C1 B 0
20: C1 B 0
21: C1 B 0
22: C1 B 0
23: C1 C 1
24: C1 C 3
25: C1 C 5
26: C1 C 0
27: C1 C 0
28: C1 C 0
29: C1 C 0
30: C1 C 0
31: C1 C 0
32: C1 C 0
33: C1 C 0
Type Sample Version
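Since the question asks about dplyr, here is a minimal sketch of the same idea using dplyr verbs. It is an illustration, not a tested drop-in solution: the names `df`, `d_rows`, `others`, `targets`, `copies` and `result` are made up for this example, the sample data contains only the C1 and C3 rows shown in the question (the elided rows are not reconstructed), and it assumes, like the original loop, that every Type should receive D's rows for every non-D Sample.

```r
library(dplyr)

# Example data: only the C1 and C3 rows shown in the question
df <- data.frame(
  Type    = rep(c("C1", "C3"), each = 17),
  Sample  = rep(c(rep(c("A", "B", "C"), each = 3), rep("D", 8)), times = 2),
  Version = rep(c(2, 4, 6, 3, 5, 7, 1, 3, 5, rep(0, 8)), times = 2),
  stringsAsFactors = FALSE
)

d_rows <- filter(df, Sample == "D")   # rows to be copied
others <- filter(df, Sample != "D")   # rows that stay as they are

# One relabelled copy of the D rows per remaining sample
targets <- setdiff(unique(df$Sample), "D")
copies  <- bind_rows(lapply(targets, function(s) mutate(d_rows, Sample = s)))

# Original rows first, then the copies, ordered by Type and Sample
result <- bind_rows(others, copies) %>%
  arrange(Type, Sample)
```

Because `others` is bound before `copies`, and `arrange()` keeps tied rows in their existing order, each Type/Sample group should list its original versions first and the copied zeros after them, as in the desired output.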