Question

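I have two vectors and I want to know which indices differ between them. I'm not sure how to do this, because NA == NA yields NA and NA == 5 also yields NA. Can anyone offer some guidance?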

# Create data with NA vs. 3
dat1 <- data.frame(foo = c(NA, 5, 9),
                  bar = c(3, 5, 9))

# Create data with NA vs. NA
dat2 <- data.frame(foo = c(NA, 5, 9),
                  bar = c(NA, 5, 9))

# Produces same result
dat1$foo == dat1$bar
dat2$foo == dat2$bar

identical((dat1$foo == dat1$bar), (dat2$foo == dat2$bar))

Answer 1

Edit

The original solution below does not work when both columns contain NA in the same row. To handle that case, we can define a function:

dissimilar_index <- function(dat) {
   ind = with(dat, (foo == bar) | (is.na(foo) & is.na(bar)))
   which(is.na(ind) | !ind)
}

dissimilar_index(dat1)
#[1] 1

dissimilar_index(dat2)
#integer(0)

To check the function, create a new data frame, dat3:
dat3 = rbind(dat1, c(2, 3))
dat3
#  foo bar
#1  NA   3
#2   5   5
#3   9   9
#4   2   3

dissimilar_index(dat3)
#[1] 1 4

We can also use:

ind = !with(dat1, is.na(foo) == is.na(bar) & foo == bar)
which(!is.na(ind) & ind)
#[1] 1

ind = !with(dat2, is.na(foo) == is.na(bar) & foo == bar)
which(!is.na(ind) & ind)
#integer(0)

Here we check whether the two columns agree on being NA and whether their values are equal.
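Since the question asks about two vectors rather than data frame columns, the same NA-aware check can be sketched for plain vectors. This is only an illustration: `differs_at` is a hypothetical helper name that does not appear in the answer, and it assumes both vectors have the same length and comparable types.

```r
# Hypothetical helper (not part of the original answer): returns the indices
# at which two equal-length vectors differ, treating NA vs. NA as equal.
differs_at <- function(x, y) {
  stopifnot(length(x) == length(y))
  one_na   <- is.na(x) != is.na(y)    # exactly one side is NA
  both_val <- !is.na(x) & !is.na(y)   # neither side is NA
  unequal  <- both_val & x != y       # FALSE & NA is FALSE, so no NA survives
  which(one_na | unequal)
}

differs_at(c(NA, 5, 9), c(3, 5, 9))        # 1
differs_at(c(NA, 5, 9), c(NA, 5, 9))       # integer(0)
differs_at(c(NA, 5, 9, 2), c(3, 5, 9, 3))  # 1 4
```

Splitting the test into "exactly one NA" and "both present but unequal" keeps every intermediate vector free of NA, so `which` needs no extra NA handling.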

Original answer

We can compare the two columns for inequality, add an extra check for NA, and use which to get the indices that differ.

ind = dat1$foo != dat1$bar
which(is.na(ind) | ind)
#[1] 1
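To see why the revised function at the top of this answer was needed, the short sketch below applies the same one-liner to dat2 from the question, where row 1 is NA in both columns. The data frame is recreated here only so the snippet runs on its own.

```r
# dat2 from the question: row 1 is NA in both columns
dat2 <- data.frame(foo = c(NA, 5, 9),
                   bar = c(NA, 5, 9))

ind <- dat2$foo != dat2$bar   # NA FALSE FALSE
which(is.na(ind) | ind)       # 1: row 1 is flagged even though both values are NA
```

Because any comparison involving NA yields NA, `is.na(ind)` cannot tell "NA vs. 3" apart from "NA vs. NA", so both cases are reported as differences.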

Answer 2

An approach using sapply and identical:

non_ident_ind <- function(df) {
    which(!sapply(1:nrow(df), function(i) identical(df$foo[i], df$bar[i])))
}

Result:

non_ident_ind(dat1)
# [1] 1
non_ident_ind(dat2)
# integer(0)
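One point worth keeping in mind with this approach is that identical compares type as well as value, so an integer 5 and a double 5 count as different. The sketch below shows that behaviour, together with a variant for two plain vectors. `non_ident_vec` is a hypothetical name introduced here for illustration, and it assumes the vectors have equal length.

```r
# identical() is strict about type, and treats NA as matching NA
identical(5L, 5)               # FALSE: integer vs. double
identical(NA_real_, NA_real_)  # TRUE

# Hypothetical vector variant of non_ident_ind (not from the original answer).
# seq_along() avoids the 1:0 pitfall when the input is empty.
non_ident_vec <- function(x, y) {
  stopifnot(length(x) == length(y))
  same <- vapply(seq_along(x), function(i) identical(x[i], y[i]), logical(1))
  which(!same)
}

non_ident_vec(c(NA, 5, 9), c(3, 5, 9))   # 1
non_ident_vec(c(NA, 5, 9), c(NA, 5, 9))  # integer(0)
```

If the two vectors might differ in storage type (for example integer vs. double), converting both with as.numeric before comparing is one way to avoid spurious mismatches.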

Another approach using apply:

which(apply(dat1, 1, function(r) length(unique(r)) > 1))
# [1] 1
which(apply(dat2, 1, function(r) length(unique(r)) > 1))
# integer(0)

R: Identifying non-identical elements in two vectors
