Question

I have 2 data frames (A and B) with the following structure:

A:

projectID    offerID
   20          12
   20          17 
   32          12
   32          25

B:

 projectID    offerID
   20          12
   20          17 
   32          12

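I want to find the pairs that are in A but not in B. So in my example, I would like to get a new df containing the pairs that are in A but not in B: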

projectID    offerID
   32           25

I tried a few options; for example:

APairs <- A %>% group_by(projectID, offerID)
BPairs <- B %>% group_by(projectID, offerID)

!(APairs %in% BPairs)

But I get TRUE/FALSE results that I can't make sense of or validate against my data.

Any help would be greatly appreciated!

Answer 1

In base R:

# define the key columns, in case A and B differ in structure
cols <- c("projectID", "offerID")
# paste the key columns into one string per row, then keep rows of A whose key is not in B
A[!do.call(paste, A[cols]) %in% do.call(paste, B[cols]), ]
#  projectID offerID
#4        32      25
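
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's sample data and applies the same paste-based key idea. The explicit `sep = "\r"` is my own assumption, not part of the answer above: it only guards against two different pairs pasting into the same string when the key columns hold text containing spaces. The name `only_in_A` is also just illustrative.

```r
# Sample data reconstructed from the tables in the question
A <- data.frame(projectID = c(20, 20, 32, 32), offerID = c(12, 17, 12, 25))
B <- data.frame(projectID = c(20, 20, 32),     offerID = c(12, 17, 12))

cols <- c("projectID", "offerID")

# One key string per row; "\r" is an assumed separator unlikely to occur in real values
key_A <- do.call(paste, c(A[cols], sep = "\r"))
key_B <- do.call(paste, c(B[cols], sep = "\r"))

# Keep the rows of A whose key never appears in B
only_in_A <- A[!key_A %in% key_B, , drop = FALSE]
print(only_in_A)
```

With numeric IDs like the ones in the question, the default space separator should behave the same way, so this is mostly a precaution for character keys.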

Answer 2

library(data.table)
setkey(setDT(A))
setkey(setDT(B))
A[!B]                # A[B] is similar to merge() so perform the opposite using !
#   projectID offerID
#1:        32      25

# in case there are extra columns in either table, specify the common columns in a vector
common.col <- c("projectID", "offerID")
setkeyv(setDT(A), cols = common.col)
setkeyv(setDT(B), cols = common.col)
A[!B]
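
As a possible variation on the above, data.table also accepts the join columns through the `on=` argument, which avoids setting keys (and therefore avoids reordering the tables). This is a sketch under the assumption that a reasonably recent data.table is installed; the sample data is rebuilt from the question and `only_in_A` is an illustrative name.

```r
library(data.table)

# Sample data reconstructed from the tables in the question
A <- data.table(projectID = c(20, 20, 32, 32), offerID = c(12, 17, 12, 25))
B <- data.table(projectID = c(20, 20, 32),     offerID = c(12, 17, 12))

common.col <- c("projectID", "offerID")

# Not-join: rows of A with no match in B on the common columns, no keys required
only_in_A <- A[!B, on = common.col]
print(only_in_A)
```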

Answer 3

We can use anti_join from dplyr:

 library(dplyr)
 anti_join(A, B)
 #    projectID offerID
 #1        32      25

If the data frames have more columns, specify the by option:

 anti_join(A, B, by = c("projectID", "offerID"))
 #    projectID offerID
 #1        32      25

Checking for pairs of items between two data frames

3 answers: