Joining data frames that contain reciprocal pairs

Time: 2016-07-12 09:53:16

Tags: r join dataframe

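I want to join two data frames on two common columns, but I do not want reciprocal pairs (such as c d and d c) to be treated as duplicates.

The example data frames look like this: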

>df
letter1 letter2 value
 d       e     1
 c       d     2
 c       e     4

>dc
letter1 letter2
 a       e
 c       a
 c       d
 c       e
 d       a
 d       c
 d       e
 e       a

I want to join them on the first two columns, keeping the values from df$value in the third column, or NA if the row does not exist in df. I tried:

s <- join(dc,df, by = c("letter1","letter2"))

>s
letter1 letter2 value
a        e       NA
c        a       NA
c        d       2
c        e       4
d        a       NA
d        c       2
d        e       1
e        a       NA

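Here, the pair d c is treated as identical to c d and receives the same value in the third column. What I want is for d c to be treated as absent from df, so that its value is NA. My desired output is: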

>s
letter1 letter2 value
a        e       NA
c        a       NA
c        d       2
c        e       4
d        a       NA
d        c       NA
d        e       1
e        a       NA

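How can I join the data frames so that reciprocal pairs are treated as distinct combinations?

Update: Sorry, I just realized there was a problem with my input data frames, and the join line I tried does work. I will accept the first answer anyway to give credit to its author.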

3 answers:

Answer 0: (score: 1)

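We can use apply to sort the two letters within each row, so that both data frames store every pair in the same order

 df[1:2] <- t(apply(df[1:2], 1, sort))
 dc[1:2] <- t(apply(dc[1:2], 1, sort))

Then perform the join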
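As a minimal sketch of what this answer does, assuming the example data from the question: base R's merge() stands in for join() here so that no extra package is needed (that substitution is an assumption, not part of the answer). Note that sorting each pair makes d c identical to c d, so both rows pick up the value 2. That is the opposite of the output requested in the question, so this approach fits only when reciprocal pairs should match.

```r
# Example data reconstructed from the question
df <- data.frame(letter1 = c("d", "c", "c"),
                 letter2 = c("e", "d", "e"),
                 value   = c(1, 2, 4),
                 stringsAsFactors = FALSE)
dc <- data.frame(letter1 = c("a", "c", "c", "c", "d", "d", "d", "e"),
                 letter2 = c("e", "a", "d", "e", "a", "c", "e", "a"),
                 stringsAsFactors = FALSE)

# Sort the two letters within each row, so (d, c) is stored as (c, d)
df[1:2] <- t(apply(df[1:2], 1, sort))
dc[1:2] <- t(apply(dc[1:2], 1, sort))

# Left join with base merge(): keep every row of dc
s_sorted <- merge(dc, df, by = c("letter1", "letter2"), all.x = TRUE)

# Both rows that are now (c, d) carry the value 2
s_sorted[s_sorted$letter1 == "c" & s_sorted$letter2 == "d", ]
```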

Answer 1: (score: 0)

You can use merge instead of join

merge(dc,df, by = c("letter1","letter2"),all=TRUE)
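The following sketch, assuming the example data from the question, checks that merge() matches on the ordered pair of columns. It uses all.x = TRUE (a left join) rather than all = TRUE; for this data both give the same rows, because every pair in df also appears in dc. The name s_left is only illustrative.

```r
# Example data reconstructed from the question
df <- data.frame(letter1 = c("d", "c", "c"),
                 letter2 = c("e", "d", "e"),
                 value   = c(1, 2, 4),
                 stringsAsFactors = FALSE)
dc <- data.frame(letter1 = c("a", "c", "c", "c", "d", "d", "d", "e"),
                 letter2 = c("e", "a", "d", "e", "a", "c", "e", "a"),
                 stringsAsFactors = FALSE)

# Left join: keep every row of dc, fill value with NA where df has no match
s_left <- merge(dc, df, by = c("letter1", "letter2"), all.x = TRUE)

# (d, c) is absent from df, so its value is NA even though (c, d) has 2
s_left[s_left$letter1 == "d" & s_left$letter2 == "c", ]
s_left[s_left$letter1 == "c" & s_left$letter2 == "d", ]
```

If the value for d c is not NA with real data, the inputs are worth inspecting first, as the question's update suggests.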

Answer 2: (score: 0)

# Create the data frames
df <- data.frame(letter1=c("d","c","c"),
                 letter2=c("e","d","e"),
                 value=c(1,2,4))

dc <- data.frame(letter1=c("a","c","c","c","d","d","d","e"),
                 letter2=c("e","a","d","e","a","c","e","a"))

# Merge the data frames (full outer join on both key columns)
dout <- merge(df, dc, by = c("letter1", "letter2"), all = TRUE)

# Outcome
  letter1 letter2 value
1       c       d     2
2       c       e     4
3       c       a    NA
4       d       e     1
5       d       a    NA
6       d       c    NA
7       a       e    NA
8       e       a    NA