R: given a column, find its duplicates among the columns of a data.frame

Time: 2013-11-04 14:12:10

Tags: r

Given the following data:

questionTagMatrix <- data.frame( question1=c("0","1","0"), question2=c("1","0", "0"), question3=c("0","1","0"), question4=c("0","1","1")  )
rownames(questionTagMatrix)[1] <- "php"
rownames(questionTagMatrix)[2] <- "html"
rownames(questionTagMatrix)[3] <- "javascript"

newQuestion <- data.frame( newquestion=c("0","1","0") )
rownames(newQuestion)[1] <- "php"
rownames(newQuestion)[2] <- "html"
rownames(newQuestion)[3] <- "javascript"

How can I find all columns of questionTagMatrix that are equal to newQuestion?

2 answers:

Answer 0: (score: 2)

You can use apply to find the columns:

questionTagMatrix[apply(questionTagMatrix, 2, function(x) 
                                               all(x == as.matrix(newQuestion)))]

Each column of questionTagMatrix is compared with newQuestion, and only the columns whose entries all match are kept. Result:

#            question1 question3
# php                0         0
# html               1         1
# javascript         0         0
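For reference, here is a self-contained sketch of the same column-wise comparison. It swaps apply for vapply, which iterates over the data.frame's columns directly and guarantees one logical value per column. The names target, isMatch and matchingColumns are illustrative only and do not appear in the original answer.

```r
# Same data as in the question
questionTagMatrix <- data.frame(
  question1 = c("0", "1", "0"),
  question2 = c("1", "0", "0"),
  question3 = c("0", "1", "0"),
  question4 = c("0", "1", "1"),
  row.names = c("php", "html", "javascript")
)
newQuestion <- data.frame(
  newquestion = c("0", "1", "0"),
  row.names = c("php", "html", "javascript")
)

# Pull the single column out as a plain character vector
# (as.character also covers the case where the column is a factor)
target <- as.character(newQuestion[[1]])

# One TRUE/FALSE per column: does every entry match the target?
isMatch <- vapply(
  questionTagMatrix,
  function(col) all(as.character(col) == target),
  logical(1)
)

# Logical indexing on a data.frame selects columns
matchingColumns <- questionTagMatrix[isMatch]
```

Converting both sides with as.character is a defensive choice: it keeps the comparison independent of whether data.frame() stored the strings as factors or as character vectors.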

Answer 1: (score: 0)

A vectorized solution using colSums:

 questionTagMatrix[,colSums(questionTagMatrix == newQuestion)
                    ==nrow(questionTagMatrix)]

          question1 question3
php                0         0
html               1         1
javascript         0         0

PS: this assumes newQuestion is a vector, not a data.frame:

newQuestion =c("0","1","0") ## not data.frame( newquestion=c("0","1","0") )
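If newQuestion has to stay a one-column data.frame as in the question, one possible way to reuse the colSums approach is to extract that column first. The sketch below assumes this; newQuestionVec, matches and result are hypothetical names introduced for illustration.

```r
questionTagMatrix <- data.frame(
  question1 = c("0", "1", "0"),
  question2 = c("1", "0", "0"),
  question3 = c("0", "1", "0"),
  question4 = c("0", "1", "1"),
  row.names = c("php", "html", "javascript")
)
newQuestion <- data.frame(newquestion = c("0", "1", "0"))

# Extract the only column as a character vector
newQuestionVec <- as.character(newQuestion[[1]])

# The vector is recycled down each column, so every column is
# compared element by element against newQuestionVec
matches <- colSums(questionTagMatrix == newQuestionVec) == nrow(questionTagMatrix)

# drop = FALSE keeps a data.frame even if only one column matches
result <- questionTagMatrix[, matches, drop = FALSE]
```

Without drop = FALSE, a single matching column would be returned as a bare vector and lose its column name.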

To get only the question names:

names(questionTagMatrix)[colSums(questionTagMatrix == newQuestion)
                    ==nrow(questionTagMatrix)]
[1] "question1" "question3"