How do I select two columns in a data frame in R?

Asked: 2019-05-02 16:37:49

Tags: r dataframe select

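Imagine I have two data frames named A and B. For each row of A, I want to check whether B contains a corresponding row. In the example below, the code should print TRUE exactly once, because the last row of A does not match the last row of B.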

A <- NULL
B <- NULL
A <- data.frame(A = c('a','b','c','d','e'), B = c('1','2','3','4','5'))
B <- data.frame(A = c('a','b','c','d','f'), B = c('1','2','3','4','5'))

i <- 0
for(i in 1: length(A$A))
{
  point <- A[i,]
  if(!point %in% B[which[1:2]])
    print(TRUE)
}

2 Answers:

Answer 0 (score: 0)

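# Paste the columns of each data frame together row by row, then test each row of A against B
bool = Reduce(paste, A) %in% Reduce(paste, B)
# bool + 1 is 1 for FALSE and 2 for TRUE, so it picks "Absent" or "Present"
transform(A, msg = c("Absent", "Present")[bool + 1])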
#  A B     msg
#1 a 1 Present
#2 b 2 Present
#3 c 3 Present
#4 d 4 Present
#5 e 5  Absent
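
The approach above turns every row into a single string and compares those strings. A minimal sketch of the same idea follows, assuming the columns are plain character vectors and that the separator "\r" never occurs in the data. The names key_A, key_B and in_B are made up for illustration. An explicit separator guards against two different rows collapsing into the same string, which can happen with the default space when values themselves contain spaces.

```r
A <- data.frame(A = c("a", "b", "c", "d", "e"),
                B = c("1", "2", "3", "4", "5"),
                stringsAsFactors = FALSE)
B <- data.frame(A = c("a", "b", "c", "d", "f"),
                B = c("1", "2", "3", "4", "5"),
                stringsAsFactors = FALSE)

# Build one key string per row by pasting all columns together.
# "\r" is an arbitrary separator assumed not to appear in the data.
key_A <- do.call(paste, c(A, sep = "\r"))
key_B <- do.call(paste, c(B, sep = "\r"))

# TRUE where a row of A also appears somewhere in B
# (TRUE for rows 1 to 4, FALSE for row 5 with this data)
in_B <- key_A %in% key_B

# Rows of A that have no counterpart in B
A[!in_B, ]
```

Note that this checks whether a row of A appears anywhere in B, not only at the same position.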

Answer 1 (score: 0)

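You can check whether the anti-join of the two tables contains any rows (that is, whether A has rows with no match in B on the common columns) and, if so, print TRUE: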

if(diff_rows <- nrow(dplyr::anti_join(A, B)) > 0) print(diff_rows)

# Joining, by = c("A", "B")
# [1] TRUE
# Warning message:
# Column `A` joining factors with different levels, coercing to character vector

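If you want to ignore the warning, you can clean up the output (passing by explicitly also silences the "Joining, by" message):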

if(diff_rows <- nrow(suppressWarnings(dplyr::anti_join(A, B, by = names(A)))) > 0) 
  print(diff_rows)

# [1] TRUE
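
The warning comes from joining factor columns whose levels differ. As an alternative to suppressing it, here is a small sketch that builds the data frames with character columns (stringsAsFactors = FALSE, which is the default from R 4.0.0 onward) and keeps the unmatched rows themselves rather than only a TRUE/FALSE flag. It assumes dplyr is installed, and the name unmatched is made up for illustration.

```r
library(dplyr)

A <- data.frame(A = c("a", "b", "c", "d", "e"),
                B = c("1", "2", "3", "4", "5"),
                stringsAsFactors = FALSE)
B <- data.frame(A = c("a", "b", "c", "d", "f"),
                B = c("1", "2", "3", "4", "5"),
                stringsAsFactors = FALSE)

# Rows of A that have no match in B on the named columns
unmatched <- anti_join(A, B, by = c("A", "B"))

# TRUE when at least one row of A is missing from B
has_unmatched <- nrow(unmatched) > 0
print(has_unmatched)

# The unmatched rows themselves
print(unmatched)
```

Keeping the result of anti_join() shows which rows of A are missing from B, which is closer to the row-by-row check the question asks for.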