I have two data frames (A and B) with the following structure:
A:
projectID offerID
20 12
20 17
32 12
32 25
B:
projectID offerID
20 12
20 17
32 12
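I want to find the (projectID, offerID) pairs that are in A but not in B. For my example, that means a new data frame containing only the pairs present in A and absent from B: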
projectID offerID
32 25
I have tried a few options, for example:
APairs <- A %>% group_by(projectID, offerID)
BPairs <- B %>% group_by(projectID, offerID)
!(APairs %in% BPairs)
But this returns TRUE/FALSE values that I cannot interpret or check against my data.
Any help would be much appreciated!
Answer 0 (score: 4)
In base R:
# define the key columns explicitly, in case A and B have different structures
cols<-c("projectID","offerID")
A[!do.call(paste,A[cols]) %in% do.call(paste,B[cols]),]
# projectID offerID
#4 32 25
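One caveat with the paste approach, offered as a hedged sketch rather than as part of the original answer: paste joins values with a single space by default, so with character keys that themselves contain spaces, two different pairs can collapse into the same string. Passing an explicit separator avoids that, provided the separator never occurs in the data (the "\r" below is an assumption about the data, and key is a hypothetical helper name). The sample data are rebuilt from the question so the snippet runs on its own.

```r
# Sample data rebuilt from the question
A <- data.frame(projectID = c(20, 20, 32, 32), offerID = c(12, 17, 12, 25))
B <- data.frame(projectID = c(20, 20, 32), offerID = c(12, 17, 12))

cols <- c("projectID", "offerID")

# Hypothetical helper: build one key string per row.
# The separator "\r" is assumed never to occur in the key values.
key <- function(df) do.call(paste, c(unname(as.list(df[cols])), sep = "\r"))

# Keep the rows of A whose key does not appear among the keys of B
only_in_A <- A[!key(A) %in% key(B), , drop = FALSE]
only_in_A
```

For purely numeric keys like the ones in the question, the default separator is already safe; the explicit one only matters once the keys are free-form strings.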
Answer 1 (score: 3)
library(data.table)
setkey(setDT(A))
setkey(setDT(B))
A[!B] # A[B] is a join (similar to merge()), so negating B with ! gives the anti-join
# projectID offerID
#1: 32 25
# in case either table has extra columns, specify the common columns in a vector
common.col <- c("projectID", "offerID")
setkeyv(setDT(A), cols = common.col)
setkeyv(setDT(B), cols = common.col)
A[!B]
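If you would rather not set keys (setkey sorts the table by reference), the same anti-join can be written with the on argument. This is a minimal sketch assuming a data.table version that supports on= in joins; the sample data are rebuilt from the question.

```r
library(data.table)

# Sample data rebuilt from the question
A <- data.table(projectID = c(20, 20, 32, 32), offerID = c(12, 17, 12, 25))
B <- data.table(projectID = c(20, 20, 32), offerID = c(12, 17, 12))

# Anti-join on the named columns; no keys are set, so A is not reordered
only_in_A <- A[!B, on = c("projectID", "offerID")]
only_in_A
```

Naming the columns in on also covers the case where A or B carries extra columns that should not take part in the comparison.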
Answer 2 (score: 2)
We can use anti_join from dplyr:
library(dplyr)
anti_join(A, B)
# projectID offerID
#1 32 25
If the data frames have more columns, specify the by argument:
anti_join(A, B, by = c("projectID", "offerID"))
# projectID offerID
#1 32 25
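For anyone who wants to run this end to end, here is a self-contained sketch with the sample data rebuilt from the question. The reverse check at the end is an addition for illustration, not something the question asked for: it confirms that every pair in B also appears in A.

```r
library(dplyr)

# Sample data rebuilt from the question
A <- data.frame(projectID = c(20, 20, 32, 32), offerID = c(12, 17, 12, 25))
B <- data.frame(projectID = c(20, 20, 32), offerID = c(12, 17, 12))

# Rows of A with no matching (projectID, offerID) pair in B
only_in_A <- anti_join(A, B, by = c("projectID", "offerID"))
only_in_A

# Reverse direction: rows of B with no match in A (none for this sample)
only_in_B <- anti_join(B, A, by = c("projectID", "offerID"))
nrow(only_in_B)
```

Passing by explicitly also silences the message dplyr prints when it has to guess the join columns.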