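I created some variables, but once they have been used I no longer need them, so I want to delete them and free the memory. The rm() function does not seem to help: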
memory.size()
30.69
tmp=matrix(rnorm(6e5*20),6e5,20)
memory.size()
207.64
rm(tmp)
memory.size()
207.64
Does this mean that tmp was deleted but its memory was not released?
Answer 0: (score: 43)
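I use gc() to free RAM between operations. Below is an example of how I use it in a loop; for a detailed discussion of gc(), see here and here, and see here for more on memory management during an R session.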
# load library
library(topicmodels)
# get data
data("AssociatedPress")
# set number of topics to start with
k <- 20
# set model options
control_LDA_VEM <-
  list(estimate.alpha = TRUE, alpha = 50/k, estimate.beta = TRUE,
       verbose = 0, prefix = tempfile(), save = 0, keep = 0,
       seed = as.integer(100), nstart = 1, best = TRUE,
       var = list(iter.max = 10, tol = 10^-6),
       em = list(iter.max = 10, tol = 10^-4),
       initialize = "random")
# create the sequence that stores the number of topics to
# iterate over
sequ <- seq(20, 300, by = 20)
# basic loop to iterate over different topic numbers with gc
# after each run to empty out RAM
lda <- vector(mode='list', length = length(sequ))
for(k in sequ) {
  lda[[k]] <- LDA(AssociatedPress[1:20,], k, method = "VEM", control = control_LDA_VEM)
  # garbage collection here frees up memory before the next round of the loop
  gc()
}
# convert list output to dataframe (suggestions for a simpler method are welcome!)
best.model.logLik <- data.frame(logLik = as.matrix(lapply(lda[sequ], logLik)), ntopic = sequ)
# plot
with(best.model.logLik, plot(ntopic, logLik, type = 'l', xlab="Number of topics", ylab="Log likelihood"))
# print ordered dataframe to see which number of topics has the highest log likelihood
(best.model.logLik.sort <- best.model.logLik[order(-as.numeric(best.model.logLik$logLik)), ])
      logLik ntopic
2  -17904.12     40
3  -18105.48     60
1  -18181.84     20
4   -18569.7     80
5  -19736.94    100
6   -21919.6    120
7  -23785.08    140
8  -24914.23    160
9  -25493.76    180
10 -25837.64    200
11 -25964.23    220
12 -26061.01    240
13 -26117.92    260
14 -26149.44    280
15 -26168.91    300
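
To tie this back to the snippet in the question, here is a minimal sketch of rm() followed by gc() on the same matrix. It reads the "(Mb)" column of the value returned by gc() instead of calling memory.size(), which is Windows-only. The helper name used_mb is my own invention for illustration, not part of R. Exact figures depend on the R version and platform, so no output is shown, and whether the operating system reports the memory as returned may also vary.

```r
# Hypothetical helper (not part of R): run a garbage collection and return
# the memory currently used by R objects, in Mb, summed over Ncells and Vcells.
# Column 2 of the matrix returned by gc() is the "(Mb)" value for "used".
used_mb <- function() sum(gc()[, 2])

before <- used_mb()

# Same object as in the question: 6e5 x 20 doubles, roughly 90 Mb
tmp <- matrix(rnorm(6e5 * 20), 6e5, 20)
with_tmp <- used_mb()

# rm() only removes the binding; the memory is reclaimed when the
# garbage collector runs, which used_mb() triggers via gc()
rm(tmp)
after <- used_mb()

cat("before:", before, "Mb\n")
cat("with tmp:", with_tmp, "Mb\n")
cat("after rm() + gc():", after, "Mb\n")
```

In this sketch the "after" figure should drop back close to the "before" figure, which is the effect the question was looking for from rm() alone.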