Question

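I have a sparse Matrix in R that is apparently too big for me to run as.matrix() on (though it is not super-huge either). The as.matrix() call in question is inside the svd() function, so I'm wondering if anyone knows of a different SVD implementation that doesn't require converting to a dense matrix first.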

Answer 1

The irlba package has a very fast SVD implementation for sparse matrices.
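
A minimal sketch of what that could look like, assuming the Matrix and irlba packages are installed. The matrix here is random test data made up for illustration, and `nv = 5` is an arbitrary choice of how many singular triplets to compute.

```r
library(Matrix)
library(irlba)

set.seed(1)
# Illustrative sparse test matrix (dgCMatrix); stands in for the real data
A <- rsparsematrix(2000, 500, density = 0.01)

# Truncated SVD: only the 5 largest singular values/vectors are computed,
# and A is never converted to a dense matrix
res <- irlba(A, nv = 5)

res$d        # the 5 largest singular values
dim(res$u)   # left singular vectors, 2000 x 5
dim(res$v)   # right singular vectors, 500 x 5
```

Note that this gives a truncated decomposition, not the full one that svd() returns, which is usually what you want for a large sparse matrix anyway.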

Answer 2

You can do a very impressive bit of sparse SVD in R using random projection, as described in http://arxiv.org/abs/0909.4061

Here is some sample code:

# computes the first k singular values of A with the corresponding singular vectors
incore_stoch_svd = function(A, k) {
  p = 10              # oversampling; may need a larger value here
  n = dim(A)[1]
  m = dim(A)[2]

  # random projection of A (the result is only n x (k+p), so dense is fine)
  Y = as.matrix(A %*% matrix(rnorm((k+p) * m), ncol=k+p))
  # the left part of the decomposition works for A (approximately)
  Q = qr.Q(qr(Y))
  # taking that off gives us something small to decompose
  B = as.matrix(t(Q) %*% A)

  # decomposing B gives us singular values and right vectors for A
  s = svd(B)
  U = Q %*% s$u
  # keep the leading k components and put it all together for a complete result
  idx = seq_len(k)
  return (list(u=U[, idx, drop=FALSE], v=s$v[, idx, drop=FALSE], d=s$d[idx]))
}
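
As a rough sanity check of the random-projection idea, the sketch below compares it against base svd() on a matrix small enough to densify. Everything here is an assumption for illustration: `stoch_svd_sketch` is a hypothetical compact restatement of the function above (repeated only so the snippet runs on its own), and the test matrix is deliberately built with rank at most 5, the case where the method should be essentially exact. For matrices whose singular values decay slowly, expect larger errors.

```r
library(Matrix)

# Hypothetical compact version of the randomized SVD above
stoch_svd_sketch <- function(A, k, p = 10) {
  m <- ncol(A)
  # project A onto k + p random directions
  Y <- as.matrix(A %*% matrix(rnorm((k + p) * m), ncol = k + p))
  Q <- qr.Q(qr(Y))               # orthonormal basis for the range of Y
  B <- as.matrix(t(Q) %*% A)     # small (k + p) x m matrix
  s <- svd(B)
  idx <- seq_len(k)
  list(u = (Q %*% s$u)[, idx, drop = FALSE],
       v = s$v[, idx, drop = FALSE],
       d = s$d[idx])
}

set.seed(42)
# Sparse test matrix of rank <= 5, built as a product of two sparse factors
L <- rsparsematrix(300, 5, density = 0.3)
R <- rsparsematrix(5, 200, density = 0.3)
A <- L %*% R

approx <- stoch_svd_sketch(A, k = 5)
exact  <- svd(as.matrix(A), nu = 0, nv = 0)   # feasible only because A is small

# largest absolute difference between approximate and exact singular values
max_err <- max(abs(approx$d - exact$d[1:5]))
max_err
```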

Answer 3

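So here's what I ended up doing. It's relatively straightforward to write a routine that dumps a sparse matrix (class dgCMatrix) to a text file in SVDLIBC's "sparse text" format, then call the svd executable, and read the three resulting text files back into R.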

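The catch is that it's pretty inefficient: it takes me about 10 seconds to read and write the files, but the actual SVD calculation takes only about 0.2 seconds or so. Still, this is of course way better than not being able to perform the calculation at all, so I'm happy. =)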
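
For anyone wanting to try the same route, the dump step might look roughly like the sketch below. It is not the routine actually used; `write_sparse_text` is a hypothetical helper, and the layout it writes is my understanding of the SVDLIBC sparse text format (a header line with rows, columns and non-zero count, then for each column its non-zero count followed by one "rowIndex value" line per entry, with 0-based row indices). Please check that against the SVDLIBC documentation before relying on it. Calling the svd executable and reading the results back are left out.

```r
library(Matrix)

# Hypothetical helper: dump a dgCMatrix in (what I take to be) SVDLIBC's
# sparse text format. Assumes a numeric column-compressed matrix.
write_sparse_text <- function(A, path) {
  stopifnot(is(A, "dgCMatrix"))
  counts <- diff(A@p)                       # non-zeros per column
  con <- file(path, "w")
  on.exit(close(con))
  # header: numRows numCols totalNonZeroValues
  writeLines(paste(nrow(A), ncol(A), length(A@x)), con)
  for (j in seq_len(ncol(A))) {
    writeLines(as.character(counts[j]), con)
    if (counts[j] > 0) {
      idx <- (A@p[j] + 1):A@p[j + 1]
      # A@i already holds 0-based row indices
      writeLines(paste(A@i[idx], formatC(A@x[idx], digits = 15, format = "g")), con)
    }
  }
  invisible(path)
}

# Small illustrative matrix: 4 x 3 with three non-zero entries
A_small <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 3),
                        x = c(2.3, 3.8, 1.5), dims = c(4, 3))
out_path <- tempfile(fileext = ".txt")
write_sparse_text(A_small, out_path)
cat(readLines(out_path), sep = "\n")
```

Writing column by column from the `@p`, `@i` and `@x` slots avoids ever building a dense copy, though the per-column loop is one likely reason the file I/O dominates the run time.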

Answer 4

rARPACK is the package you need. It works like a charm and is super fast because it is parallelized via C and C++.
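
A minimal usage sketch, assuming the Matrix and rARPACK packages are installed. The test matrix and `k = 5` are made up for illustration. As far as I know, newer releases of rARPACK are a thin wrapper around the RSpectra package, which exposes the same svds() function.

```r
library(Matrix)
library(rARPACK)

set.seed(1)
# Illustrative sparse test matrix (dgCMatrix); stands in for the real data
A <- rsparsematrix(2000, 500, density = 0.01)

# Compute the 5 largest singular values and their singular vectors
# directly on the sparse matrix
res <- svds(A, k = 5)

res$d        # the 5 largest singular values
dim(res$u)   # 2000 x 5
dim(res$v)   # 500 x 5
```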

SVD for sparse matrices in R
