I want to merge two data frames (df1 and df2 below) by a pair of variables, sibA and sibB, that appears in each data frame.
sibA = c(1,2,13,4,6)
sibB = c(11,12,3,14,16)
mum = c("aa", "bb", "cc", "dd", NA)
df1 = data.frame(sibA, sibB, mum)
df1
# sibA sibB mum
# 1 1 11 aa
# 2 2 12 bb
# 3 13 3 cc
# 4 4 14 dd
# 5 6 16 <NA>
sibA = c(1,12,3,14,22,23)
sibB = c(11,2,13,4,32,33)
inbredCoeffsibA = c(.1,.12,.3,.14,.22,.23)
inbredCoeffsibB = c(.11,.2,.13,.4,.32,.33)
df2 = data.frame(sibA, sibB, inbredCoeffsibA, inbredCoeffsibB)
df2
# sibA sibB inbredCoeffsibA inbredCoeffsibB
# 1 1 11 0.10 0.11
# 2 12 2 0.12 0.20
# 3 3 13 0.30 0.13
# 4 14 4 0.14 0.40
# 5 22 32 0.22 0.32
# 6 23 33 0.23 0.33
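The problem is that the order of the two members within each pair is arbitrary in both data frames (for example, sibling pair 2-12 is reversed in df2, and pair 3-13 is reversed in df1). The desired resulting data frame: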
sibA =c(1,12,3,14,22,23,6)
sibB = c(11,2,13,4,32,33,16)
mum = c("aa", "bb", "cc", "dd", NA, NA,NA)
inbredCoeffsibA= c(.1,.12,.3,.14,.22,.23,NA)
inbredCoeffsibB= c(.11,.2,.13,.4,.32,.33, NA)
desired = data.frame(sibA, sibB, mum, inbredCoeffsibA, inbredCoeffsibB)
desired
# sibA sibB mum inbredCoeffsibA inbredCoeffsibB
# 1 1 11 aa 0.10 0.11
# 2 12 2 bb 0.12 0.20
# 3 3 13 cc 0.30 0.13
# 4 14 4 dd 0.14 0.40
# 5 22 32 <NA> 0.22 0.32
# 6 23 33 <NA> 0.23 0.33
# 7 6 16 <NA> NA NA
(Ideally, the merge would also work if the variable mum were numeric.)
Answer 0 (score: 3)
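You can use pmin and pmax to put each pair of keys into a consistent order (smaller value first, larger value second), and then use merge(..., all=TRUE):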
df1$k1 <- pmin(df1$sibA, df1$sibB)
df1$k2 <- pmax(df1$sibA, df1$sibB)
df2$k1 <- pmin(df2$sibA, df2$sibB)
df2$k2 <- pmax(df2$sibA, df2$sibB)
merge(df1, df2, by=c("k1","k2"), all=TRUE)
# k1 k2 sibA.x sibB.x mum sibA.y sibB.y inbredCoeffsibA inbredCoeffsibB
# 1 1 11 1 11 aa 1 11 0.10 0.11
# 2 2 12 2 12 bb 12 2 0.12 0.20
# 3 3 13 13 3 cc 3 13 0.30 0.13
# 4 4 14 4 14 dd 14 4 0.14 0.40
# 5 6 16 6 16 <NA> NA NA NA NA
# 6 22 32 NA NA <NA> 22 32 0.22 0.32
# 7 23 33 NA NA <NA> 23 33 0.23 0.33
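The merged result above still carries both copies of the sibling columns (sibA.x/sibB.x from df1 and sibA.y/sibB.y from df2). A minimal sketch of one way to collapse them into the layout asked for in the question follows. It assumes that df2's ordering of each pair should win whenever the pair exists in df2 (because inbredCoeffsibA and inbredCoeffsibB are tied to that ordering), with df1's ordering used as a fallback. The names m and result are introduced here for illustration only.

```r
# Rebuild the example data from the question.
df1 <- data.frame(sibA = c(1, 2, 13, 4, 6),
                  sibB = c(11, 12, 3, 14, 16),
                  mum  = c("aa", "bb", "cc", "dd", NA))
df2 <- data.frame(sibA = c(1, 12, 3, 14, 22, 23),
                  sibB = c(11, 2, 13, 4, 32, 33),
                  inbredCoeffsibA = c(.1, .12, .3, .14, .22, .23),
                  inbredCoeffsibB = c(.11, .2, .13, .4, .32, .33))

# Order-independent keys: smaller sibling id in k1, larger in k2.
df1$k1 <- pmin(df1$sibA, df1$sibB)
df1$k2 <- pmax(df1$sibA, df1$sibB)
df2$k1 <- pmin(df2$sibA, df2$sibB)
df2$k2 <- pmax(df2$sibA, df2$sibB)

# Full outer join on the normalised keys.
m <- merge(df1, df2, by = c("k1", "k2"), all = TRUE)

# Assumption: keep df2's orientation where the pair exists in df2,
# otherwise fall back to df1's orientation.
m$sibA <- ifelse(is.na(m$sibA.y), m$sibA.x, m$sibA.y)
m$sibB <- ifelse(is.na(m$sibB.y), m$sibB.x, m$sibB.y)

# Keep only the columns requested in the question.
result <- m[, c("sibA", "sibB", "mum", "inbredCoeffsibA", "inbredCoeffsibB")]
result
```

Two caveats apply. merge sorts its output by the key columns, so the pair 6-16 lands in row 5 rather than at the end as in the desired data frame; the contents of the rows are otherwise the same. Also, because the join uses only k1 and k2, the type of mum should not affect the merge, so the same steps ought to work if mum were numeric.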