Question

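I have a huge matrix whose values are 1, 2, or 3 (plus some NAs). If the matrix is n×m, I need to recode it as n×3m, so that each value of the original matrix corresponds to 3 entries of the new matrix. If the value in the old matrix is x, then the x-th of those three entries is 1 and the other two are 0 (for NA, all three are 0).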

For example, the row

1, 3, NA, 1

would be recoded as

1 0 0 0 0 1 0 0 0 1 0 0

I need to do this efficiently in R because the matrix is huge. What is the most efficient way? The matrix is stored in a data.table.

Answer 1

Preallocate a result matrix filled with zeros, then set the 1s in a single step using matrix indexing.

mat <- matrix(c(1,3,NA,1,1,3,NA,1),nrow=2,byrow=TRUE)
mat

#     [,1] [,2] [,3] [,4]
#[1,]    1    3   NA    1
#[2,]    1    3   NA    1

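# Preallocate the n x 3m result, filled with zeros
newmat <- matrix(0, ncol=ncol(mat)*3, nrow=nrow(mat))
# Two-column (row, column) index: value x in column j maps to column 3*(j-1) + x
ind <- cbind(rep(seq_len(nrow(mat)), ncol(mat)), as.vector(mat + (col(mat)*3-3)))
# Index rows containing NA select nothing, so NA cells stay all zero
newmat[ind] <- 1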

newmat
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#[1,]    1    0    0    0    0    1    0    0    0     1     0     0
#[2,]    1    0    0    0    0    1    0    0    0     1     0     0
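
The same indexing idea can be wrapped in a function. The following is only a sketch: the name `recode_onehot` is made up for illustration and is not from the answer above. It assumes the input is a numeric matrix containing only 1, 2, 3, or NA, and it stores the result as integers rather than doubles, which should roughly halve the memory needed for the result.

```r
# Hypothetical helper: name and signature are illustrative only.
# Assumes `mat` is a numeric matrix whose values are 1, 2, 3 or NA.
recode_onehot <- function(mat) {
  stopifnot(is.matrix(mat))
  n <- nrow(mat)
  m <- ncol(mat)

  # Preallocate an integer result (4 bytes per cell instead of 8 for doubles)
  out <- matrix(0L, nrow = n, ncol = 3L * m)

  # Linear (column-major) positions of the non-NA cells
  keep <- which(!is.na(mat))

  # Recover the row and the zero-based column from each linear position
  rows  <- (keep - 1L) %% n + 1L
  col0  <- (keep - 1L) %/% n

  # Value x in original column j lands in output column 3*(j-1) + x
  cols <- 3L * col0 + as.integer(mat[keep])

  out[cbind(rows, cols)] <- 1L
  out
}

mat <- matrix(c(1, 3, NA, 1, 1, 3, NA, 1), nrow = 2, byrow = TRUE)
res <- recode_onehot(mat)
```

If the data lives in a data.table `dt` whose columns are all numeric, `as.matrix(dt)` should give a matrix suitable as input. That conversion copies the data, so it is worth checking that it fits in memory.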

You can also use this approach with a sparse matrix from the Matrix package.

library(Matrix)
newmat <- Matrix(0, ncol=ncol(mat)*3, nrow=nrow(mat),sparse=TRUE)
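# Drop index rows containing NA; drop=FALSE keeps a two-column matrix even if only one row remains
newmat[ind[complete.cases(ind), , drop=FALSE]] <- 1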

newmat 
#2 x 12 sparse Matrix of class "dgCMatrix"
#                            
#[1,] 1 . . . . 1 . . . 1 . .
#[2,] 1 . . . . 1 . . . 1 . .

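Sparse matrices have several advantages, including significantly lower memory use when most entries are zero, as they are here: at most one entry in each group of three is nonzero.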
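
As a variation on the sparse approach, the matrix can be built directly from the row and column positions with `sparseMatrix()` from the Matrix package instead of assigning into an empty sparse matrix. This is a sketch under the same assumptions as above (values are only 1, 2, 3, or NA). The variable names `keep`, `i`, `j`, and `sp` are illustrative.

```r
library(Matrix)

mat <- matrix(c(1, 3, NA, 1, 1, 3, NA, 1), nrow = 2, byrow = TRUE)
n <- nrow(mat)

# Linear (column-major) positions of the non-NA cells
keep <- which(!is.na(mat))

# Row index, and output column 3*(j-1) + x for value x in original column j
i <- (keep - 1L) %% n + 1L
j <- 3L * ((keep - 1L) %/% n) + as.integer(mat[keep])

# Build the n x 3m sparse matrix in one call; x = 1 is recycled for every (i, j) pair
sp <- sparseMatrix(i = i, j = j, x = 1, dims = c(n, 3L * ncol(mat)))

# Compare the sizes of the sparse and dense representations
print(object.size(sp))
print(object.size(as.matrix(sp)))
```

On a tiny example like this one the sparse object may well be larger than the dense one because of its fixed overhead. Any saving would only show up on large matrices, so it is worth measuring on a realistic subset of the data.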

Recoding a huge matrix in R
