R: remove from one sparse matrix the elements that also exist in another

Date: 2019-09-15 13:16:06

Tags: r matrix

We use the Matrix library to build a sparse matrix:

library(Matrix)
M = sparseMatrix(i = uidx,j = midx,x = freq)

Suppose the matrix M looks like this:

i   j   x
1   2   0.2
1   3   0.3
1   15  0.15
2   7   0.1
...
280 2   0.6
281 7   0.25

After some computation, we obtain another sparse matrix Q:

i   j   x
1   2   18
1   4   16
1   9   8
2   10  19
...

I want to use Q as the base matrix and remove from Q every (i,j) position that already exists in M,
similar to a set difference:

Q-M

In my example, this would give the following result:

i   j   x
1   4   16
1   9   8    
...
#we have  1  2  18 in original Q but 1  2  0.2 with same index (1,2) already exists in M so remove that row from Q.

Is there an efficient way or an existing function to do this?
To reproduce the situation, you can run the following code:

library(Matrix)
M = sparseMatrix(i = c(1,1,1),j = c(2,3,15),x = c(0.2,0.3,0.15))
Q = sparseMatrix(i = c(1,1,1),j = c(2,4,9),x = c(18,16,8))
#result should produce a sparse matrix like:
#R = sparseMatrix(i = c(1,1),j = c(4,9),x = c(16,8))

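1 answer:

Answer 0: (score: 0)

With the Matrix package loaded, you can get there with the summary function. It gives an overview of the sparse matrix in triplet form (i, j, x) without converting it to a dense matrix. Based on that, you can compare the (i, j) index pairs of the two matrices directly and select the Q entries to keep. I extended the example to check that other values are kept or removed as expected. The result matches what you expect for the R matrix.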
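For orientation, here is a small sketch of what summary() returns for the matrix Q from the question. The variable name s is only illustrative and does not appear in the original post.

```r
library(Matrix)

# Q from the question's reproducible example
Q <- sparseMatrix(i = c(1, 1, 1), j = c(2, 4, 9), x = c(18, 16, 8))

# summary() on a sparse matrix returns a data frame of the stored entries,
# one row per entry, with columns i (row index), j (column index), x (value)
s <- summary(Q)
print(names(s))
print(s)
```

Because the result is an ordinary data frame of index pairs and values, the usual data frame tools (subsetting, %in%, merge) can be applied to it.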

library(Matrix)
M = sparseMatrix(i = c(1,1,1,1, 2, 2),
                 j = c(2,3,15, 16, 4, 8),
                 x = c(0.2,0.3,0.15, 0.16, 0.2, 0.08))
Q = sparseMatrix(i = c(1,1,1,1, 2),
                 j = c(2,4,9,16, 4),
                 x = c(18,16,8,50, 40))

#result should produce a sparse matrix like:
R = sparseMatrix(i = c(1,1), 
                 j = c(4,9), 
                 x = c(16, 8))


# creates a summary of the sparse matrices (summary is coming from Matrix)
summary_m <- summary(M)
summary_q <- summary(Q)

# which records to keep
# build one key per (i, j) pair, ignoring the x value, so that every Q entry
# is checked against every M entry. A row-by-row `==` on the two summaries
# would only compare entries at the same position and requires equal row counts.
key_m <- paste(summary_m$i, summary_m$j)
key_q <- paste(summary_q$i, summary_q$j)
keep <- !(key_q %in% key_m)

# keep all Q records whose (i, j) pair does not occur in M
# (a logical index also behaves correctly when nothing has to be removed)
q_left <- summary_q[keep, ]

# build full sparse matrix
result <- sparseMatrix(i = q_left$i, j = q_left$j, x = q_left$x)

identical(result, R)
[1] TRUE
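
If this is needed in more than one place, the same idea can be wrapped in a function. The following is only a sketch under a few assumptions: sparse_anti_join is a hypothetical name (it is not part of the Matrix package), the (i, j) pairs are encoded as numeric keys instead of strings, and the result keeps the dimensions of Q rather than shrinking to the largest remaining index.

```r
library(Matrix)

# Hypothetical helper: drop from Q every stored entry whose (i, j)
# position is also stored in M. Q and M may have different dimensions.
sparse_anti_join <- function(Q, M) {
  sq <- summary(Q)
  sm <- summary(M)
  # Use a shared row count so that the same (i, j) pair gets the same key
  # in both matrices; as.numeric() avoids integer overflow on large matrices
  nr <- as.numeric(max(nrow(Q), nrow(M)))
  key_q <- (as.numeric(sq$j) - 1) * nr + sq$i
  key_m <- (as.numeric(sm$j) - 1) * nr + sm$i
  keep <- !(key_q %in% key_m)
  # Rebuild a sparse matrix from the surviving triplets, keeping Q's shape
  sparseMatrix(i = sq$i[keep], j = sq$j[keep], x = sq$x[keep], dims = dim(Q))
}

# Usage with the matrices from the question
M <- sparseMatrix(i = c(1, 1, 1), j = c(2, 3, 15), x = c(0.2, 0.3, 0.15))
Q <- sparseMatrix(i = c(1, 1, 1), j = c(2, 4, 9), x = c(18, 16, 8))
res <- sparse_anti_join(Q, M)
print(summary(res))
```

For the question's data this leaves the entries (1,4) = 16 and (1,9) = 8. Passing dims = dim(Q) is a design choice: without it, sparseMatrix() infers the dimensions from the largest remaining indices, which can silently change the shape of the result.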