Question

I have two matrices,

> A
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    1    1    2    3
 [2,]    6    1    2    3    2
 [3,]    8    1    1    3    2
 [4,]    3    1    1    1    1
 [5,]    7    2    1    2    2
 [6,]    4    2    2    2    1
 [7,]    5    2    2    3    2
 [8,]    2    2    1    2    3
 [9,]    9    1    1    3    3

and

> B
      [,1] [,2] [,3] [,4] [,5]
 [1,]    7    2    1    2    2
 [2,]    4    2    2    2    1
 [3,]    5    2    2    3    2

Matrix B is contained in A (rows 5, 6 and 7 of A). I am trying to write code that removes the rows of B from A, i.e. produces this matrix:

      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    1    1    2    3
 [2,]    6    1    2    3    2
 [3,]    8    1    1    3    2
 [4,]    3    1    1    1    1
 [5,]    2    2    1    2    3
 [6,]    9    1    1    3    3

Answer 1

One approach is to use anti_join from dplyr:

library(dplyr)
# anti_join keeps the rows of A that have no match in B (joining on all columns)
as.matrix(anti_join(as.data.frame(A), as.data.frame(B)))

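This assumes you want to remove from A every individual row that also appears in B. If you only want to remove instances of the whole B matrix, this approach will not work.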
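For comparison, the same row-by-row removal can be sketched in base R without dplyr. This is only an illustration, not part of the original answer: the helper name row_key is made up here, and the matrices are the ones from the question.

```r
# Matrices from the question (A filled column by column)
A <- matrix(c(1, 6, 8, 3, 7, 4, 5, 2, 9,
              1, 1, 1, 1, 2, 2, 2, 2, 1,
              1, 2, 1, 1, 1, 2, 2, 1, 1,
              2, 3, 3, 1, 2, 2, 3, 2, 3,
              3, 2, 2, 1, 2, 1, 2, 3, 3), ncol = 5)
B <- A[5:7, ]

# Hypothetical helper: turn every row into one comma-separated key
row_key <- function(m) apply(m, 1, paste, collapse = ",")

# Keep the rows of A whose key does not occur among the rows of B
res <- A[!row_key(A) %in% row_key(B), , drop = FALSE]
res
```

Like anti_join, this drops every row of A that equals some row of B, wherever it occurs, so it shares the limitation described above.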

===

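Let's try another option based on Julius's excellent solution (if you choose to use this one, please accept Julius's answer rather than this one; I just wanted to see whether I could improve on it):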

First, let's create a more complex example:

A<-structure(c(1, 6, 8, 3, 7, 4, 5, 2, 9, 7, 4, 5, 1, 2, 2, 3, 1, 
1, 1, 1, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 
2, 1, 1, 1, 2, 2, 7, 4, 5, 1, 2, 3, 3, 1, 2, 2, 3, 2, 3, 2, 2, 
3, 2, 2, 2, 1, 3, 2, 2, 1, 2, 1, 2, 3, 3, 2, 1, 2, 1, 2, 2, 1
), .Dim = c(16L, 5L))

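This matrix contains two complete instances of B, plus another "instance" of B that starts in the middle of row 13 and ends in the middle of row 16.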

      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    1    1    2    3
 [2,]    6    1    2    3    2
 [3,]    8    1    1    3    2
 [4,]    3    1    1    1    1
 [5,]    7    2    1    2    2
 [6,]    4    2    2    2    1
 [7,]    5    2    2    3    2
 [8,]    2    2    1    2    3
 [9,]    9    1    1    3    3
[10,]    7    2    1    2    2
[11,]    4    2    2    2    1
[12,]    5    2    2    3    2
[13,]    1    1    7    2    1
[14,]    2    2    4    2    2
[15,]    2    1    5    2    2
[16,]    3    2    1    1    1
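To make the three occurrences visible, here is a small sketch that flattens both matrices into strings (one character per entry, which assumes single-digit values) and lists where B's string occurs in A's string. The variable names pos and aligned are chosen here for illustration only.

```r
# The 16 x 5 example, written row by row for readability
A <- matrix(c(1, 1, 1, 2, 3,
              6, 1, 2, 3, 2,
              8, 1, 1, 3, 2,
              3, 1, 1, 1, 1,
              7, 2, 1, 2, 2,
              4, 2, 2, 2, 1,
              5, 2, 2, 3, 2,
              2, 2, 1, 2, 3,
              9, 1, 1, 3, 3,
              7, 2, 1, 2, 2,
              4, 2, 2, 2, 1,
              5, 2, 2, 3, 2,
              1, 1, 7, 2, 1,
              2, 2, 4, 2, 2,
              2, 1, 5, 2, 2,
              3, 2, 1, 1, 1), ncol = 5, byrow = TRUE)
B <- A[5:7, ]

# Flatten a matrix row by row into a single string without separators
aux <- function(m) paste0(c(t(m)), collapse = "")

# Character positions at which aux(B) occurs in aux(A)
pos <- as.integer(gregexpr(aux(B), aux(A), fixed = TRUE)[[1]])
pos                              # 21 46 63

# A match is row-aligned only if it starts on the first column of a row
aligned <- (pos - 1) %% ncol(A) == 0
aligned                          # TRUE TRUE FALSE
```

Positions 21 and 46 correspond to the starts of rows 5 and 10; position 63 falls on the third column of row 13, which is the accidental mid-row occurrence.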

Now let's modify Julius's solution so that it removes the two "real" instances of B but not the "accidental" instance that appears mid-row:

aux <- function(m) paste0(c(t(m)), collapse = "")  # aux function from Julius's answer, here with an empty separator (assumes single-digit entries)
locsaux <- gregexpr(aux(B), aux(A))[[1]]  # positions of the instances of B within A
# first keep only the matches that start at the beginning of a row, then translate them into row numbers of A
locations <- (locsaux[(locsaux - 1) %% ncol(A) == 0] - 1) / ncol(A) + 1
# remove the rows identified above plus the subsequent n - 1 rows, where n is the number of rows in B
A[-unlist(lapply(locations, function(x) seq(x, x + nrow(B) - 1))), ]

Result:

      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    1    1    2    3
 [2,]    6    1    2    3    2
 [3,]    8    1    1    3    2
 [4,]    3    1    1    1    1
 [5,]    2    2    1    2    3
 [6,]    9    1    1    3    3
 [7,]    1    1    7    2    1
 [8,]    2    2    4    2    2
 [9,]    2    1    5    2    2
[10,]    3    2    1    1    1

Success!
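If the entries are not guaranteed to be single digits, the string trick becomes fragile. A possible alternative, sketched here under the assumption that A and B have the same number of columns, compares B against every block of consecutive rows of A directly. The function name remove_block is hypothetical and not taken from either answer.

```r
# Hypothetical helper: drop every block of consecutive rows of A that equals B
remove_block <- function(A, B) {
  stopifnot(ncol(A) == ncol(B))
  n <- nrow(B)
  # candidate start rows (empty if A has fewer rows than B)
  starts <- seq_len(max(nrow(A) - n + 1, 0))
  is_hit <- vapply(starts, function(i) {
    all(A[i:(i + n - 1), , drop = FALSE] == B)
  }, logical(1))
  hits <- starts[is_hit]
  if (length(hits) == 0) return(A)   # nothing to remove
  drop_rows <- unique(unlist(lapply(hits, function(i) i:(i + n - 1))))
  A[-drop_rows, , drop = FALSE]
}

# The 16 x 5 example from above, written row by row
A <- matrix(c(1, 1, 1, 2, 3,  6, 1, 2, 3, 2,  8, 1, 1, 3, 2,  3, 1, 1, 1, 1,
              7, 2, 1, 2, 2,  4, 2, 2, 2, 1,  5, 2, 2, 3, 2,  2, 2, 1, 2, 3,
              9, 1, 1, 3, 3,  7, 2, 1, 2, 2,  4, 2, 2, 2, 1,  5, 2, 2, 3, 2,
              1, 1, 7, 2, 1,  2, 2, 4, 2, 2,  2, 1, 5, 2, 2,  3, 2, 1, 1, 1),
            ncol = 5, byrow = TRUE)
B <- A[5:7, ]

out <- remove_block(A, B)
dim(out)   # 10 5
```

Because whole rows are compared, a mid-row occurrence can never match. One difference from the gregexpr version: if two row-aligned matches overlap, this sketch removes the rows of both.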

Answer 2

Assuming that A and B have the same number of columns and that B matches only once in A,

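aux <- function(m) paste0(c(t(m)), collapse = ",")  # flatten row by row, comma-separated
# each single-digit entry takes 2 characters (digit + comma), hence the division by 2 * ncol(A)
A[-1:-nrow(B) - (gregexpr(aux(B), aux(A))[[1]] - 1)[1] / (2 * ncol(A)), ]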
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    1    1    2    3
# [2,]    6    1    2    3    2
# [3,]    8    1    1    3    2
# [4,]    3    1    1    1    1
# [5,]    2    2    1    2    3
# [6,]    9    1    1    3    3

Here aux converts a matrix into a string, for example,

aux(B)
# [1] "7,2,1,2,2,4,2,2,2,1,5,2,2,3,2"

and gregexpr(aux(B), aux(A)) finds B in A. The rest converts the position of aux(B) found within aux(A) into the rows of A that should be removed.

That is, this removes the entire matrix B from A. It will run into trouble if B starts in the middle of a row of A.
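The index arithmetic can be unpacked step by step. The sketch below assumes single-digit entries and the comma separator shown in the aux(B) output above; the intermediate names pos, first_row and rows are introduced here only for illustration.

```r
# Matrices from the question (A filled column by column)
A <- matrix(c(1, 6, 8, 3, 7, 4, 5, 2, 9,
              1, 1, 1, 1, 2, 2, 2, 2, 1,
              1, 2, 1, 1, 1, 2, 2, 1, 1,
              2, 3, 3, 1, 2, 2, 3, 2, 3,
              3, 2, 2, 1, 2, 1, 2, 3, 3), ncol = 5)
B <- A[5:7, ]

aux <- function(m) paste0(c(t(m)), collapse = ",")

# 1. Character position of the first match of aux(B) inside aux(A)
pos <- gregexpr(aux(B), aux(A), fixed = TRUE)[[1]][1]
pos                                   # 41

# 2. Each entry occupies 2 characters (digit + comma), so one row of A
#    spans 2 * ncol(A) characters; convert the position to a row number
first_row <- (pos - 1) / (2 * ncol(A)) + 1
first_row                             # 5

# 3. The rows to remove are first_row and the following nrow(B) - 1 rows
rows <- first_row + seq_len(nrow(B)) - 1
rows                                  # 5 6 7

res <- A[-rows, , drop = FALSE]
```

If the match started mid-row, first_row would not be a whole number, which is one way to detect the problematic case mentioned above.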

Removing a matrix contained in another matrix
