I have the following two data frames.
1)
reference exchange_destionation
1234 ""
1235 ""
1236 ""
1237 ""
2)
order_id_parent exchange_destionation
1234 XMAD
1234 XPAR
1236 XMAD
1237 XPAR
The resulting data frame should look like this:
3)
reference exchange_destionation
1234 "XMAD" "XPAR"
1235 ""
1236 "XMAD"
1237 "XPAR"
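What I am trying to do is match the "reference" column of the first data frame against the "order_id_parent" column of the second data frame and, when a reference has more than one exchange_destination value, add all of them to the exchange_destination column of the first data frame.
With the following code I get what I want from the second data frame, but only the first occurrence for each reference.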
rs.ord$exchange_destination = rs.ord.hijas$exchange_destination[match(rs.ord$reference,rs.ord.hijas$order_id_parent)]
This is the result I get; the "XPAR" value for reference 1234 is missing.
3)
Reference Exchange_Dest
1234 "XMAD"
1235 ""
1236 "XMAD"
1237 "XPAR"
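The missing value follows from how match() works: it returns a single position per lookup value, namely the first hit. The sketch below illustrates this with small stand-alone vectors that mirror the data above (the vector names are illustrative and not taken from the real rs.ord objects).

```r
# Toy vectors mirroring the question's data (names are illustrative)
reference <- c(1234L, 1235L, 1236L, 1237L)
order_id_parent <- c(1234L, 1234L, 1236L, 1237L)
exchange <- c("XMAD", "XPAR", "XMAD", "XPAR")

# match() gives one position per lookup value: the first hit, or NA if none
idx <- match(reference, order_id_parent)
idx
# [1]  1 NA  3  4

# Position 2 (the "XPAR" row for 1234) is therefore never selected
exchange[idx]
# [1] "XMAD" NA     "XMAD" "XPAR"

# All positions that belong to 1234
hits <- which(order_id_parent == 1234L)
hits
# [1] 1 2
```

Any approach that should keep every value has to collect all matching rows per reference rather than a single index.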
Answer 0 (score: 3)
We can use aggregate + merge
aggregate(exchange_destionation.y~reference,
merge(df1, df2, by.x = "reference", by.y = "order_id_parent", all = TRUE),
toString, na.action = na.pass)
# reference exchange_destionation.y
#1 1234 XMAD, XPAR
#2 1235 NA
#3 1236 XMAD
#4 1237 XPAR
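A note on the na.action = na.pass argument: the formula method of aggregate drops rows containing NA by default, so a reference without any match would disappear from the result. The following is a small sketch on rebuilt toy data (the df1 and df2 here are simplified stand-ins, with character columns assumed instead of factors) comparing the two behaviours.

```r
# Simplified stand-ins for the data frames (character columns assumed)
df1 <- data.frame(reference = 1234:1237, exchange_destionation = NA)
df2 <- data.frame(order_id_parent = c(1234L, 1234L, 1236L, 1237L),
                  exchange_destionation = c("XMAD", "XPAR", "XMAD", "XPAR"),
                  stringsAsFactors = FALSE)

# Full outer join; the shared column name gets the suffixes .x and .y
merged <- merge(df1, df2, by.x = "reference", by.y = "order_id_parent",
                all = TRUE)

# Default na.action (na.omit): the unmatched reference 1235 is dropped
res_default <- aggregate(exchange_destionation.y ~ reference, merged, toString)

# na.pass keeps the NA row, so 1235 stays in the result
res_pass <- aggregate(exchange_destionation.y ~ reference, merged, toString,
                      na.action = na.pass)

nrow(res_default)
nrow(res_pass)
```

With na.pass the entry for 1235 is the string "NA" produced by toString, not a true missing value, which may matter for later filtering.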
In dplyr this can be written as
library(dplyr)
full_join(df1, df2, c("reference" = "order_id_parent")) %>%
group_by(reference) %>%
summarise(exchange_dest = toString(exchange_destionation.y))
Data
df1 <- structure(list(reference = 1234:1237, exchange_destionation =
c(NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(order_id_parent = c(1234L, 1234L, 1236L, 1237L
), exchange_destionation = structure(c(1L, 2L, 1L, 2L), .Label =
c("XMAD", "XPAR"), class = "factor")), class = "data.frame", row.names = c(NA, -4L))
Answer 1 (score: 3)
Using sapply
# For each reference, paste together every matching destination from df2
df1$exchange_destionation <- sapply(df1$reference, function(x) {
  paste(df2$exchange_destionation[df2$order_id_parent %in% x], collapse = " ")
})
> df1
reference exchange_destionation
2 1234 XMAD XPAR
3 1235
4 1236 XMAD
5 1237 XPAR
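If the values should stay separate, as in the desired output where 1234 holds "XMAD" and "XPAR" as two strings, one possible variation is to store them in a list column with lapply instead of pasting them together. This is only a sketch on rebuilt toy data (character columns assumed), not part of the original answer.

```r
# Simplified stand-ins for the data frames (character columns assumed)
df1 <- data.frame(reference = 1234:1237)
df2 <- data.frame(order_id_parent = c(1234L, 1234L, 1236L, 1237L),
                  exchange_destionation = c("XMAD", "XPAR", "XMAD", "XPAR"),
                  stringsAsFactors = FALSE)

# lapply returns a list, so each row keeps a character vector of any length
df1$exchange_destionation <- lapply(df1$reference, function(x) {
  df2$exchange_destionation[df2$order_id_parent == x]
})

# Number of destinations found per reference
n_dest <- lengths(df1$exchange_destionation)
n_dest
# [1] 2 0 1 1
```

A reference without matches ends up with an empty character vector, which can be told apart from an empty string.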
Data
> dput(df1)
structure(list(reference = c("1234", "1235", "1236", "1237"),
exchange_destionation = c("XMAD XPAR", "", "XMAD", "XPAR"
)), row.names = 2:5, class = "data.frame")
> dput(df2)
structure(list(order_id_parent = c("1234", "1234", "1236", "1237"
), exchange_destionation = c("XMAD", "XPAR", "XMAD", "XPAR")), row.names = 2:5, class = "data.frame")