I have a TermDocumentMatrix tdm
(terms: 66779, documents: 609551, non-sparse entries: 9704315). I am trying to process it with the code listed below:
# 1. counting sum of term values for each document
colTotals = apply(tdm , 2, sum)
# 2. Singular Value Decomposition
s = svd(as.matrix(tdm), nu = nrow(tdm), nv = ncol(tdm))
# 3. Latent Semantic Analysis (with lsa package)
sp = lsa(tdm)
Each of the calls listed above (1, 2, 3) fails with the same error:
Error in vector(typeof(x$v), nr * nc) : vector size cannot be NA
In addition: Warning message:
In nr * nc : NAs produced by integer overflow
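As far as I can tell, the warning comes from multiplying the two dimensions as 32-bit integers. Here is a minimal sketch that reproduces it from the dimensions alone. The variable names are just for illustration, and the memory estimate assumes a dense matrix of 8-byte doubles:

```r
# Dimensions of tdm, stored as R integers (32-bit)
n_terms <- 66779L
n_docs  <- 609551L

# Integer multiplication overflows: yields NA with the warning
# "NAs produced by integer overflow"
overflowed <- n_terms * n_docs

# The same product in double precision gives the real cell count
cells <- as.numeric(n_terms) * as.numeric(n_docs)

# Largest value an R integer can hold, for comparison
int_max <- .Machine$integer.max

# Rough size of a dense copy, assuming 8 bytes per cell
dense_gib <- cells * 8 / 1024^3

print(overflowed)
print(cells > int_max)
print(dense_gib)
```

So every call that converts tdm to a dense matrix would need far more cells than an integer index can count, even though fewer than 10 million entries are actually non-zero.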
How can I work with a matrix this large?