Question

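I've been stuck trying to make some math run faster and would appreciate some help. Here is the problem. I start with an n x n matrix X. I then take a rank-k partial SVD to obtain matrices U, D, V. This gives a low-rank approximation Z = U D V^T.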

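Problem: compute the squared Frobenius norm of Z, restricted to the upper triangle only. That is, let Y be an n x n matrix such that Y_ij = 1 if i < j and zero otherwise; then I want

$$\|Z \circ Y\|_F^2$$

where the fancy circle denotes the elementwise matrix product. The catch is that, in practice, Z is too large to fit in memory.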
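
Written out in terms of the factors, and assuming D is diagonal with entries d_1, ..., d_k (as the R code below does via diag(s$d)), the target quantity is:

$$
Z_{ij} = \sum_{l=1}^{k} U_{il}\, d_l\, V_{jl}, \qquad
\|Z \circ Y\|_F^2 = \sum_{i<j} Z_{ij}^2 = \sum_{i<j} \Big( \sum_{l=1}^{k} U_{il}\, d_l\, V_{jl} \Big)^2 .
$$

Each entry Z_ij depends only on row i of U and row j of V, so the sum can be accumulated without ever storing Z.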

Here is some R code that illustrates the problem:

library(Matrix)
library(RSpectra)
library(testthat)

set.seed(27)

n   <- 10000
nnz <- 10000
r   <- 20

M <- rsparsematrix(nrow = n, ncol = n, nnz = nnz)
s <- RSpectra::svds(M, r)

Z <- s$u %*% diag(s$d) %*% t(s$v)

# exclude the diagonal!
Z_ut <- Z * upper.tri(Z, diag = FALSE)

expected <- sum(Z_ut^2)

Currently, I am doing the computation with RcppArmadillo, like this:

#include <RcppArmadillo.h>

using namespace arma;

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
double p_u_f_norm_sq_impl(
    const arma::mat& U,
    const arma::rowvec& d,
    const arma::mat& V) {

  int n = U.n_rows;
  double total = 0;

  arma::mat DVt = diagmat(d) * V.t();

  // i^th row of Z = U D V^T truncated to the portion
  // in the upper triangle
  arma::rowvec Z_i_trunc;

  // norm of the upper triangle excluding diagonal
  for (int i = 0; i < n - 1; i++) {
    Z_i_trunc = U.row(i) * DVt.cols(i + 1, n - 1);
    total += dot(Z_i_trunc, Z_i_trunc);
  }

  return total;
}

We can then test that the implementation is correct:

out <- p_u_f_norm_sq_impl(s$u, s$d, s$v)
expect_equal(out, expected)

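In the end, I will need to repeat this 200 times, where the original data matrix X is a sparse 200,000 x 200,000 matrix with roughly one million entries, where {{1 }}, all while still fitting in laptop memory.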

Ideally, this step would be much faster than it is now.
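
One possible direction, offered as a sketch under assumptions rather than a solution benchmarked at full size: writing a_i for row i of U scaled elementwise by d, and v_j for row j of V, each entry is Z_ij = a_i . v_j, so the sum over j > i of Z_ij^2 equals a_i^T S_i a_i with S_i the sum over j > i of v_j v_j^T. S_i is only k x k and can be updated as a running sum while walking i from the last row to the first. That would cost on the order of n k^2 operations instead of the roughly n^2 k of the row-by-row loop. The names `Matrix`, `upper_tri_norm_sq`, and `upper_tri_norm_sq_naive` are hypothetical, and plain `std::vector` is used instead of Armadillo only to keep the example self-contained; porting it to RcppArmadillo is left open.

```cpp
#include <cstddef>
#include <vector>

// Illustrative dense matrix type: a vector of rows (assumption, not Armadillo).
using Matrix = std::vector<std::vector<double>>;

// Reference version: forms each Z_ij explicitly for i < j. O(n^2 k) time.
double upper_tri_norm_sq_naive(const Matrix& U, const std::vector<double>& d,
                               const Matrix& V) {
  const std::size_t n = U.size(), k = d.size();
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = i + 1; j < n; ++j) {
      double z = 0.0;
      for (std::size_t p = 0; p < k; ++p) z += U[i][p] * d[p] * V[j][p];
      total += z * z;
    }
  }
  return total;
}

// Running-sum version: O(n k^2) time, O(k^2) extra memory, never forms Z.
double upper_tri_norm_sq(const Matrix& U, const std::vector<double>& d,
                         const Matrix& V) {
  const std::size_t n = U.size(), k = d.size();
  std::vector<double> S(k * k, 0.0);  // S = sum of v_j v_j^T over j > i
  std::vector<double> a(k);           // a = row i of U scaled by d
  double total = 0.0;

  // Walk rows from last to first so S always covers exactly the rows j > i.
  for (std::size_t i = n; i-- > 0;) {
    for (std::size_t p = 0; p < k; ++p) a[p] = d[p] * U[i][p];

    // total += a^T S a
    for (std::size_t p = 0; p < k; ++p) {
      double row = 0.0;
      for (std::size_t q = 0; q < k; ++q) row += S[p * k + q] * a[q];
      total += a[p] * row;
    }

    // Fold v_i v_i^T into S for the next (smaller) i.
    for (std::size_t p = 0; p < k; ++p)
      for (std::size_t q = 0; q < k; ++q) S[p * k + q] += V[i][p] * V[i][q];
  }
  return total;
}
```

As a small check, with U = {{1, 2}, {3, 4}, {5, 6}}, d = {2, 1}, and V = {{1, 0}, {0, 1}, {1, 1}}, the strictly upper-triangular entries of Z are 2, 4, and 10, so both functions return 120. On real data the two versions sum in a different order, so they should agree only up to floating-point tolerance.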

Speeding up computation of the norm of the upper triangle of an SVD reconstruction
