If two columns match, add every matching value to a new column - R

Time: 2019-07-24 07:46:43

Tags: r dataframe match

I have the following two data frames.

1)
    reference   exchange_destionation
    1234          ""
    1235          ""
    1236          ""
    1237          ""

2)
    order_id_parent    exchange_destionation
    1234               XMAD
    1234               XPAR
    1236               XMAD
    1237               XPAR

The resulting data frame should look like this:

3)
        reference   exchange_destionation
        1234          "XMAD" "XPAR"
        1235          ""
        1236          "XMAD"
        1237          "XPAR"

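What I want to do is match the "reference" column of the first data frame against the "Source_ID" column of the second data frame (order_id_parent in the sample above). If a reference has more than one exchange_destionation value in the second data frame, all of them should be added to the exchange_destionation column of my first data frame.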

With the following code I get the values I want from the second data frame, but only the first match for each reference.

 rs.ord$exchange_destination = rs.ord.hijas$exchange_destination[match(rs.ord$reference,rs.ord.hijas$order_id_parent)]

This is the result I get; "XPAR" is missing for reference 1234.

3)
        Reference   Exchange_Dest
        1234          "XMAD" 
        1235          ""
        1236          "XMAD"
        1237          "XPAR"
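
The missing value follows from how match() works: for each element of its first argument it returns only the position of the first match in the second argument. A minimal sketch using plain vectors that mirror the sample data (the vector names here are illustrative, not the asker's actual objects):

```r
# Vectors mirroring the sample data from the question
reference       <- c(1234, 1235, 1236, 1237)
order_id_parent <- c(1234, 1234, 1236, 1237)
exchange        <- c("XMAD", "XPAR", "XMAD", "XPAR")

# match() reports one position per reference: the FIRST hit, or NA if none
idx <- match(reference, order_id_parent)
idx            # 1 NA 3 4

# Indexing with idx therefore never reaches the second 1234 row ("XPAR")
exchange[idx]  # "XMAD" NA "XMAD" "XPAR"

# All positions for a given reference can be found with which()
all_hits <- which(order_id_parent == 1234)
all_hits       # 1 2
```

Collecting every match per reference therefore needs a grouping step rather than a single match() lookup.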

2 Answers:

Answer 0: (score: 3)

We can use aggregate + merge

aggregate(exchange_destionation.y~reference, 
     merge(df1, df2, by.x = "reference", by.y = "order_id_parent", all = TRUE),
      toString, na.action =  na.pass)

#  reference exchange_destionation.y
#1      1234              XMAD, XPAR
#2      1235                      NA
#3      1236                    XMAD
#4      1237                    XPAR
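
In the output above, reference 1235 ends up as NA, while the desired result in the question shows an empty string. One possible variation, sketched here under the assumption that exchange_destionation is stored as character, drops missing values before collapsing. The helper name collapse_non_na is hypothetical and only used for illustration.

```r
# Sample data rebuilt as plain character columns (an assumption; the answer uses a factor)
df1 <- data.frame(reference = 1234:1237,
                  exchange_destionation = NA_character_,
                  stringsAsFactors = FALSE)
df2 <- data.frame(order_id_parent = c(1234L, 1234L, 1236L, 1237L),
                  exchange_destionation = c("XMAD", "XPAR", "XMAD", "XPAR"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: collapse only the non-missing values, so "no match" becomes ""
collapse_non_na <- function(v) toString(v[!is.na(v)])

# Left join keeps every reference from df1, including those without a match
merged <- merge(df1, df2, by.x = "reference", by.y = "order_id_parent", all.x = TRUE)

# na.pass keeps the unmatched row so it reaches the helper instead of being dropped
res <- aggregate(exchange_destionation.y ~ reference, merged,
                 collapse_non_na, na.action = na.pass)
res
```

With this change the row for 1235 holds an empty string instead of NA; the other rows are unaffected.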

The same thing can be written in dplyr as

library(dplyr)
full_join(df1, df2, c("reference" = "order_id_parent")) %>%
    group_by(reference)  %>%
    summarise(exchange_dest = toString(exchange_destionation.y))

Data

df1 <- structure(list(reference = 1234:1237, exchange_destionation = 
c(NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, -4L))

df2 <- structure(list(order_id_parent = c(1234L, 1234L, 1236L, 1237L
), exchange_destionation = structure(c(1L, 2L, 1L, 2L), .Label = 
c("XMAD", "XPAR"), class = "factor")), class = "data.frame", row.names = c(NA, -4L))

Answer 1: (score: 3)

Using sapply

df1$exchange_destionation = 
  sapply(df1$reference,function(x){paste(df2$exchange_destionation[df2$order_id_parent%in%x],collapse=" ")})

> df1
  reference exchange_destionation
2      1234             XMAD XPAR
3      1235                      
4      1236                  XMAD
5      1237                  XPAR
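
The sapply call above scans df2 once for every reference. For larger tables, one possible alternative is to group df2 once with split() and then look the references up by name. This is only a sketch under the assumption that both key columns convert cleanly to character; by_parent and collapsed are illustrative names.

```r
# Sample data as character columns, mirroring the dput() output in this answer
df1 <- data.frame(reference = c("1234", "1235", "1236", "1237"),
                  exchange_destionation = "",
                  stringsAsFactors = FALSE)
df2 <- data.frame(order_id_parent = c("1234", "1234", "1236", "1237"),
                  exchange_destionation = c("XMAD", "XPAR", "XMAD", "XPAR"),
                  stringsAsFactors = FALSE)

# Group the destinations by parent id in a single pass
by_parent <- split(df2$exchange_destionation, df2$order_id_parent)

# Collapse each group into one space-separated string
collapsed <- vapply(by_parent, paste, character(1), collapse = " ")

# Look up each reference by name; references without a group come back as NA
df1$exchange_destionation <- unname(collapsed[as.character(df1$reference)])
df1$exchange_destionation[is.na(df1$exchange_destionation)] <- ""
df1
```

The result should match the sapply version, with an empty string for references that have no row in df2.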

Data

> dput(df1)
structure(list(reference = c("1234", "1235", "1236", "1237"), 
    exchange_destionation = c("XMAD XPAR", "", "XMAD", "XPAR"
    )), row.names = 2:5, class = "data.frame")
> dput(df2)
structure(list(order_id_parent = c("1234", "1234", "1236", "1237"
), exchange_destionation = c("XMAD", "XPAR", "XMAD", "XPAR")), row.names = 2:5, class = "data.frame")