Question

I have a matrix and I want to zero out certain specific elements.

For example, suppose my matrix is:

m <- matrix(1:100, ncol=10)

Then I have two vectors indicating which elements to keep:

m.from <- c(2, 5, 4, 4, 6, 3, 1, 4, 2, 5)
m.to   <- c(7, 9, 6, 8, 9, 5, 6, 8, 4, 8)

So, for example, in row 1 I will keep elements 3:6 and set elements 1:2 and 7:10 to 0. In row 2 I will keep 6:8 and zero the rest, and so on.

Now, I can easily do:

for (line in 1:nrow(m))
    {
    m[line, 1:m.from[line]] <- 0
    m[line, m.to[line]:ncol(m)] <- 0
    }

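which gives the correct result.

In my particular case, however, I am working with a matrix of roughly 15000 x 3000, which makes a loop like this very slow.

How can I speed this code up? I thought about using apply, but how would I access the right indices of m.from and m.to?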

Answer 1

Here is a simple matrix-oriented solution:

m[col(m) <= m.from] <- 0
m[col(m) >= m.to] <- 0
m
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0   21   31   41   51    0    0    0     0
 [2,]    0    0    0    0    0   52   62   72    0     0
 [3,]    0    0    0    0   43    0    0    0    0     0
 [4,]    0    0    0    0   44   54   64    0    0     0
 [5,]    0    0    0    0    0    0   65   75    0     0
 [6,]    0    0    0   36    0    0    0    0    0     0
 [7,]    0   17   27   37   47    0    0    0    0     0
 [8,]    0    0    0    0   48   58   68    0    0     0
 [9,]    0    0   29    0    0    0    0    0    0     0
[10,]    0    0    0    0    0   60   70    0    0     0

(I think I might also win an R golf prize with this one.) My entry would be:

m[col(m)<=m.from|col(m)>= m.to]<-0
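
The one-liner works because m.from and m.to have one entry per row: R recycles them down the columns of col(m), so entry i always lines up with row i. A minimal sketch of that idea wrapped in a function, checked against the loop from the question. The name zero_outside and the length check are my own additions, not part of the answer:

```r
## Hypothetical helper (name is an assumption): zero every cell of row i
## whose column is <= from[i] or >= to[i].
zero_outside <- function(m, from, to) {
  ## Recycling only lines up with rows if there is exactly one entry per row
  stopifnot(length(from) == nrow(m), length(to) == nrow(m))
  cols <- col(m)                      # matrix holding each cell's column index
  m[cols <= from | cols >= to] <- 0   # from/to are recycled column by column
  m
}

m <- matrix(1:100, ncol = 10)
m.from <- c(2, 5, 4, 4, 6, 3, 1, 4, 2, 5)
m.to   <- c(7, 9, 6, 8, 9, 5, 6, 8, 4, 8)

## Reference result from the loop in the question
expected <- m
for (line in seq_len(nrow(m))) {
  expected[line, 1:m.from[line]] <- 0
  expected[line, m.to[line]:ncol(m)] <- 0
}

res <- zero_outside(m, m.from, m.to)
stopifnot(all(res == expected))
```

If the vectors were indexed by column instead of by row, the recycling would not line up and a different mask would be needed.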

Answer 2

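The best solution is to precompute all the indices to be replaced and then replace them with a single assignment operation.

Because R stores matrices in column-major order, I found it easier to think about the runs of elements to be replaced in a transposed version of the matrix. That is what I use below. However, if the two calls to t() are too expensive, I'm sure you can work out a clever way to compute the indices for the untransposed matrix, perhaps using a two-column matrix containing the row and column indices.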

## Your example
m <- matrix(1:100, ncol=10)
m.from <- c(2, 5, 4, 4, 6, 3, 1, 4, 2, 5)
m.to   <- c(7, 9, 6, 8, 9, 5, 6, 8, 4, 8)

## Let's work with a transposed version of your matrix
tm <- t(m)

## Calculate indices of cells to be replaced
i <- (seq_len(ncol(tm)) - 1) * nrow(tm)
m.to   <- c(1, m.to + i)
m.from <- c(m.from + i, length(m))
ii <- unlist(mapply(seq, from = m.to, to = m.from))

## Perform replacement and transpose back results
tm[ii] <- 0
m <- t(tm)
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,]    0    0   21   31   41   51    0    0    0     0
#  [2,]    0    0    0    0    0   52   62   72    0     0
#  [3,]    0    0    0    0   43    0    0    0    0     0
#  [4,]    0    0    0    0   44   54   64    0    0     0
#  [5,]    0    0    0    0    0    0   65   75    0     0
#  [6,]    0    0    0   36    0    0    0    0    0     0
#  [7,]    0   17   27   37   47    0    0    0    0     0
#  [8,]    0    0    0    0   48   58   68    0    0     0
#  [9,]    0    0   29    0    0    0    0    0    0     0
# [10,]    0    0    0    0    0   60   70    0    0     0
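
For what it's worth, here is one possible way to get the indices for the untransposed matrix, as hinted at above. In column-major storage the cell in row r and column c sits at linear position r + (c - 1) * nrow(m), so the positions can be computed directly. This is only a sketch; the variable names are mine, and it assumes every m.to value is at most ncol(m):

```r
m <- matrix(1:100, ncol = 10)
m.from <- c(2, 5, 4, 4, 6, 3, 1, 4, 2, 5)
m.to   <- c(7, 9, 6, 8, 9, 5, 6, 8, 4, 8)

n_r <- nrow(m)
n_c <- ncol(m)

## Columns to zero in each row: 1..from and to..ncol (assumes to <= ncol)
cols <- mapply(function(f, t) c(seq_len(f), seq.int(t, n_c)),
               m.from, m.to, SIMPLIFY = FALSE)

## Row number repeated once per column to zero in that row
rows <- rep.int(seq_len(n_r), lengths(cols))

## Column-major linear index of cell (row, col)
idx <- rows + (unlist(cols) - 1) * n_r

res <- m
res[idx] <- 0   # single assignment, no t() calls

## Check against the loop from the question
expected <- m
for (line in seq_len(n_r)) {
  expected[line, 1:m.from[line]] <- 0
  expected[line, m.to[line]:n_c] <- 0
}
stopifnot(all(res == expected))
```

SIMPLIFY = FALSE keeps the result a list even when every row happens to drop the same number of columns.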

Answer 3

An sapply version.

m <- matrix(1:100, ncol=10)
m.from <- c(2, 5, 4, 4, 6, 3, 1, 4, 2, 5)
m.to   <- c(7, 9, 6, 8, 9, 5, 6, 8, 4, 8)

t(sapply(1:nrow(m), function(i) replace(m[i,], c(1:m.from[i], m.to[i]:ncol(m)), 0 )))   



     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0   21   31   41   51    0    0    0     0
 [2,]    0    0    0    0    0   52   62   72    0     0
 [3,]    0    0    0    0   43    0    0    0    0     0
 [4,]    0    0    0    0   44   54   64    0    0     0
 [5,]    0    0    0    0    0    0   65   75    0     0
 [6,]    0    0    0   36    0    0    0    0    0     0
 [7,]    0   17   27   37   47    0    0    0    0     0
 [8,]    0    0    0    0   48   58   68    0    0     0
 [9,]    0    0   29    0    0    0    0    0    0     0
[10,]    0    0    0    0    0   60   70    0    0     0

Elapsed time not yet tested.
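
If you want to time it yourself, something like the following rough sketch could be used. The matrix size, the random m.from/m.to values and the function names are all made up for illustration, and I'm not quoting any timings because they depend on the machine:

```r
set.seed(1)
nr <- 1500
nc <- 300
big  <- matrix(runif(nr * nc), nrow = nr)
from <- sample(1:100, nr, replace = TRUE)
to   <- from + sample(2:100, nr, replace = TRUE)   # at most 200, so always < nc

## The loop from the question
loop_version <- function(m, from, to) {
  for (line in seq_len(nrow(m))) {
    m[line, 1:from[line]] <- 0
    m[line, to[line]:ncol(m)] <- 0
  }
  m
}

## The sapply approach from this answer
sapply_version <- function(m, from, to) {
  t(sapply(seq_len(nrow(m)), function(i)
    replace(m[i, ], c(1:from[i], to[i]:ncol(m)), 0)))
}

## The col() mask from Answer 1
mask_version <- function(m, from, to) {
  m[col(m) <= from | col(m) >= to] <- 0
  m
}

t_loop   <- system.time(r_loop   <- loop_version(big, from, to))
t_sapply <- system.time(r_sapply <- sapply_version(big, from, to))
t_mask   <- system.time(r_mask   <- mask_version(big, from, to))

## The results must agree before the timings mean anything
stopifnot(all(r_loop == r_sapply), all(r_loop == r_mask))

## Elapsed seconds for each approach
print(c(loop   = t_loop[["elapsed"]],
        sapply = t_sapply[["elapsed"]],
        mask   = t_mask[["elapsed"]]))
```

A single system.time() call is noisy, so repeat the runs (or use a benchmarking package) before drawing conclusions.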

Answer 4

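This option constructs a two-column index matrix of the elements to be replaced and needs no matrix transposition, so it should be hard to beat for speed.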

## Your data
m <- matrix(1:100, ncol=10)
m.from <- c(2, 5, 4, 4, 6, 3, 1, 4, 2, 5)
m.to   <- c(7, 9, 6, 8, 9, 5, 6, 8, 4, 8)

## Construct a two column matrix with row (ii) and column (jj) indices
## of cells to be replaced
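## Row indices run over nrow(m); each is repeated once per column to zero
ii <- rep.int(seq_len(nrow(m)), times = (m.from + (ncol(m) - m.to + 1)))
## Columns to keep in each row (SIMPLIFY = FALSE always returns a list)
jj <- mapply(seq, from = m.from + 1, to = m.to - 1, SIMPLIFY = FALSE)
## Columns to zero are all the others
jj <- unlist(lapply(jj, function(X) setdiff(seq_len(ncol(m)), X)))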
ij <- cbind(ii, jj)

## Replace cells
m[ij] <- 0
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,]    0    0   21   31   41   51    0    0    0     0
#  [2,]    0    0    0    0    0   52   62   72    0     0
#  [3,]    0    0    0    0   43    0    0    0    0     0
#  [4,]    0    0    0    0   44   54   64    0    0     0
#  [5,]    0    0    0    0    0    0   65   75    0     0
#  [6,]    0    0    0   36    0    0    0    0    0     0
#  [7,]    0   17   27   37   47    0    0    0    0     0
#  [8,]    0    0    0    0   48   58   68    0    0     0
#  [9,]    0    0   29    0    0    0    0    0    0     0
# [10,]    0    0    0    0    0   60   70    0    0     0
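
The example matrix is square, which can hide mix-ups between row and column counts. As a quick sanity check, here is a sketch of the same two-column indexing idea on a non-square matrix. The 4 x 10 matrix and the from2/to2 vectors are invented for illustration:

```r
## Made-up non-square example: 4 rows, 10 columns
m2    <- matrix(1:40, nrow = 4)
from2 <- c(2, 5, 4, 1)
to2   <- c(7, 9, 6, 3)

## Columns to keep in each row, always as a list
keep <- mapply(seq, from = from2 + 1, to = to2 - 1, SIMPLIFY = FALSE)

## Columns to zero in each row are all the remaining ones
drop_cols <- lapply(keep, function(k) setdiff(seq_len(ncol(m2)), k))

## Two-column (row, column) index matrix
ij <- cbind(rep.int(seq_len(nrow(m2)), lengths(drop_cols)),
            unlist(drop_cols))

res2 <- m2
res2[ij] <- 0

## Check against the loop from the question
expected2 <- m2
for (line in seq_len(nrow(m2))) {
  expected2[line, 1:from2[line]] <- 0
  expected2[line, to2[line]:ncol(m2)] <- 0
}
stopifnot(all(res2 == expected2))
```

Indexing a matrix with a two-column numeric matrix treats each row of the index as one (row, column) pair, which is what makes the single assignment possible.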

Accessing a specific range of matrix elements in R
