Question

We start with:

m1 <- matrix(1:32, ncol = 4, byrow = TRUE)
m2 <- matrix(1:16, ncol = 4, byrow = TRUE)

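In case it is not obvious, this produces two matrices, one 8x4 and the other 4x4, such that the first 4 rows of the former are identical to the latter.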
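The relationship between the two matrices can be checked directly. This is only a quick sanity check using base R; the name `first_four_match` is made up for illustration.

```r
m1 <- matrix(1:32, ncol = 4, byrow = TRUE)
m2 <- matrix(1:16, ncol = 4, byrow = TRUE)

dim(m1)  # rows and columns of the larger matrix
dim(m2)  # rows and columns of the smaller matrix

# TRUE when the first four rows of m1 are exactly m2
first_four_match <- identical(m1[1:4, ], m2)
first_four_match
```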

I would like a function along the lines of this pseudo/semi code:

# x is always the bigger matrix; an if check could go here, but assume nrow(x) > nrow(y)
countAinB <- function(x, y) {

  # new matrix of 0s with the same dim as x, plus 1 extra column for the found/not found (1/0) coding
  res <- matrix(0, nrow(x), ncol(x) + 1)

  # this for loop needs replacing, it is slow in R
  for (i in seq_len(nrow(y))) {
    # bad R below: `in` is not a valid way to test whether a row occurs in x
    if (y[i, ] in x) {
      ?? put a 1 in the extra column of res, at the row of x that matches y[i, ]
    }
  }
  return(res)
}
C <- countAinB(m1, m2)

Now C is a matrix identical to x, except that it has an extra column of 0s and 1s indicating which rows of m1 were found in m2.

My real dataset is huge, so I am trying to find the most efficient solution.
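For reference, the pseudocode above can be read literally as the loop-based sketch below. This is an assumed interpretation, not the asker's code: the function name `count_a_in_b_loop` is hypothetical, and it keeps the row-by-row loop, so it is the slow baseline the question wants to improve on.

```r
m1 <- matrix(1:32, ncol = 4, byrow = TRUE)
m2 <- matrix(1:16, ncol = 4, byrow = TRUE)

# Hypothetical helper: returns x with one extra 0/1 column marking rows that occur in y
count_a_in_b_loop <- function(x, y) {
  res <- cbind(x, 0L)
  for (i in seq_len(nrow(y))) {
    # t(x) == y[i, ] compares y[i, ] against every row of x at once;
    # a row of x matches when all ncol(x) entries are equal
    hit <- colSums(t(x) == y[i, ]) == ncol(x)
    res[hit, ncol(res)] <- 1L
  }
  res
}

C_loop <- count_a_in_b_loop(m1, m2)
```

The inner comparison is vectorised, but the function still loops once per row of y, which is why a join or hash-based lookup tends to scale better.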

Answer 1

data.table is a fast solution for this kind of problem:

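library(data.table)
DT1 <- data.table(m1)
# append a placeholder column to m2 and key the table on the original columns V1..V4
DT2 <- data.table(cbind(m2, 0), key = paste0("V", seq_len(ncol(m2))))
setnames(DT2, c(head(names(DT2), -1L), "found"))
# look up every row of DT1 in DT2; rows without a match get NA in `found`
# (data.table >= 1.9.4 needs by = .EACHI to keep the join columns in the result)
DT2[DT1, list(found = ifelse(is.na(found), 0, 1)), by = .EACHI]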

Here, we left join DT2 to DT1 using the first four columns of each. This produces:

#    V1 V2 V3 V4 found
# 1:  1  2  3  4     1
# 2:  5  6  7  8     1
# 3:  9 10 11 12     1
# 4: 13 14 15 16     1
# 5: 17 18 19 20     0
# 6: 21 22 23 24     0
# 7: 25 26 27 28     0
# 8: 29 30 31 32     0

where found indicates whether the row is present in both objects. You can convert back to a matrix with as.matrix.
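As a cross-check that stays in matrix form, the same found column can be sketched in base R by collapsing each row into a single string key and testing membership with `%in%`. This is not part of the answer above: `row_key` is a hypothetical helper, and the approach assumes the separator character never appears in the data.

```r
m1 <- matrix(1:32, ncol = 4, byrow = TRUE)
m2 <- matrix(1:16, ncol = 4, byrow = TRUE)

# Hypothetical helper: paste the columns of each row into one string per row
row_key <- function(m) do.call(paste, c(as.data.frame(m), sep = "\r"))

# 1 where a row of m1 also occurs in m2, 0 otherwise
found <- as.integer(row_key(m1) %in% row_key(m2))

# m1 with the extra found column appended, still a matrix
C <- cbind(m1, found = found)
```

The string keys cost extra memory, so for very large inputs the keyed data.table join is likely the better choice.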

R: Finding the rows of one matrix in another matrix
