Cross-check *multiple* vectors against each other to test whether all elements exist in all vectors

Time: 2019-06-28 17:24:52

Tags: r vector

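I have multiple sets (vectors) that must contain the same elements. In theory they should, but in practice I suspect that some vectors are missing some elements.

All the approaches I have seen for this kind of problem handle the two-vector case and do not extend to multiple vectors.

In short, what I am looking for is setequal() applied to multiple vectors, since neither uniqueness nor order matters to me.

Here is an example: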

#Six sets of characters and numbers that are pretty similar, though not identical
vec_a <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split=" "))
vec_b <- unlist(strsplit("Z d 5 B P y 4 R 6 y w u N T b", split=" "))
vec_c <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split=" "))
vec_d <- unlist(strsplit("Z d 5 A P x 4 R 6 y w u N W b", split=" "))
vec_e <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split=" "))
vec_f <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split=" "))

#I want to cross check all 6 sets against each other,
#to see whether all elements appear in all sets (order doesn't matter, nor uniqueness),
#OR whether some elements DON'T exist in some of the sets. I'd like
#to flag the elements that don't appear in all 6 sets.


#As a start, I just want to get a TRUE/FALSE answer to whether
#all elements appear in all 6 vectors.
Reduce(setequal, list(vec_a, vec_b, vec_c, vec_d, vec_e, vec_f))
# [1] FALSE

#It DOES make sense to get that FALSE returned, because
#not all 6 vectors are the same.
#HOWEVER, note that vec_a, vec_e, and vec_f ARE IDENTICAL,
#but when running the following command, I still get FALSE, which doesn't make sense.
Reduce(setequal, list(vec_a, vec_e, vec_f))
# [1] FALSE
#So this method clearly doesn't work accurately.

Any ideas?

Thanks!

1 Answer:

Answer 0 (score: 1)

You can find all the common elements with:

l <- list(vec_a, vec_b, vec_c, vec_d, vec_e, vec_f)
( common <- Reduce(intersect, l) )
#  [1] "Z" "d" "5" "P" "y" "4" "R" "6" "w" "u" "N" "b"

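(Keep in mind that you may prefer to store them in a list rather than as individual vectors, but as always that depends on your overall project/application.)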
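For the plain TRUE/FALSE part of the question, one possible sketch is to compare every vector against the first one. The helper name `all_setequal` is hypothetical and not part of base R. It also helps to see why `Reduce(setequal, ...)` returns FALSE even for identical vectors: after the first step the accumulated value is a logical TRUE/FALSE, so the next call compares that logical value with a character vector instead of comparing two of the original vectors.

```r
# Vectors copied from the question (vec_c is omitted; it is identical to vec_a)
vec_a <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split = " "))
vec_b <- unlist(strsplit("Z d 5 B P y 4 R 6 y w u N T b", split = " "))
vec_d <- unlist(strsplit("Z d 5 A P x 4 R 6 y w u N W b", split = " "))
vec_e <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split = " "))
vec_f <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split = " "))

# Hypothetical helper: TRUE when every vector in `vecs` holds the same set
# of elements (order and duplicates are ignored, as with setequal())
all_setequal <- function(vecs) {
  if (length(vecs) < 2L) return(TRUE)
  first <- vecs[[1]]
  all(vapply(vecs[-1], function(v) setequal(v, first), logical(1)))
}

all_same_three <- all_setequal(list(vec_a, vec_e, vec_f))
all_same_five  <- all_setequal(list(vec_a, vec_b, vec_d, vec_e, vec_f))

# Why Reduce(setequal, ...) fails: the second call receives the logical
# result of the first call, not a vector of elements
step1 <- setequal(vec_a, vec_e)   # a single logical value
step2 <- setequal(step1, vec_f)   # compares that logical with a character vector
```

Here `all_same_three` is TRUE and `all_same_five` is FALSE, while `step2` is FALSE even though `vec_a`, `vec_e`, and `vec_f` are identical.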

To find how each vector differs from that common set:

lapply(l, setdiff, common)
# [[1]]
# [1] "A" "T"
# [[2]]
# [1] "B" "T"
# [[3]]
# [1] "A" "T"
# [[4]]
# [1] "A" "x" "W"
# [[5]]
# [1] "A" "T"
# [[6]]
# [1] "A" "T"

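(This is an example where giving the elements of the list names would be better, so that you know which is which; you may want to do that when you build the list.)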
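A minimal sketch of that naming idea follows. The element names `a` to `f` and the variables `extras`, `flagged`, and `missing_from` are my own choices for illustration. It also collects the elements that do not appear in all six vectors and, for each one, lists the vectors that lack it.

```r
# Vectors copied from the question
vec_a <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split = " "))
vec_b <- unlist(strsplit("Z d 5 B P y 4 R 6 y w u N T b", split = " "))
vec_c <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split = " "))
vec_d <- unlist(strsplit("Z d 5 A P x 4 R 6 y w u N W b", split = " "))
vec_e <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split = " "))
vec_f <- unlist(strsplit("Z d 5 A P y 4 R 6 y w u N T b", split = " "))

# Name the list elements so every result is labelled with its source vector
l <- list(a = vec_a, b = vec_b, c = vec_c, d = vec_d, e = vec_e, f = vec_f)
common <- Reduce(intersect, l)

# Elements each vector has beyond the common set, labelled by name
extras <- lapply(l, setdiff, common)

# Elements that appear in at least one vector but not in all of them
flagged <- setdiff(Reduce(union, l), common)

# For each flagged element, the names of the vectors that do not contain it
missing_from <- lapply(
  setNames(flagged, flagged),
  function(el) names(l)[!vapply(l, function(v) el %in% v, logical(1))]
)
```

With this data, `missing_from[["A"]]` is "b" and `missing_from[["T"]]` is "d", which points at the two vectors that break the common set.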

You can find which ones are identical to which:

l