I get the following error when I try to use the lda function. My training data set has only 54683 rows and 12 variables.
Error: cannot allocate vector of size 34.8 Gb
In addition: Warning messages:
1: In rep.int(c(1, numeric(n)), n - 1L) :
Reached total allocation of 3066Mb: see help(memory.size)
2: In rep.int(c(1, numeric(n)), n - 1L) :
Reached total allocation of 3066Mb: see help(memory.size)
3: In rep.int(c(1, numeric(n)), n - 1L) :
Reached total allocation of 3066Mb: see help(memory.size)
4: In rep.int(c(1, numeric(n)), n - 1L) :
Reached total allocation of 3066Mb: see help(memory.size)
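A rough sanity check on the 34.8 Gb figure, offered as a guess rather than a confirmed diagnosis: rep.int(c(1, numeric(n)), n - 1L) looks like the pattern for filling an n-by-n identity-style matrix, and a dense matrix of doubles with n = 68353 (the row count of the full file, taken from the sample() call in my code) would need about that much memory. That this n is the source of the allocation is an assumption.

```r
# Memory needed for a dense n x n matrix of doubles (8 bytes each), in GiB.
# Assumption: n is the number of rows in the full file, as used in sample().
n <- 68353
size_gib <- n^2 * 8 / 1024^3
print(round(size_gib, 1))  # 34.8
```

If that reading is right, something in the model has one level per row of the original file, and the size of the data itself is not the problem.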
Here is my sessionInfo():
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] MASS_7.3-45
loaded via a namespace (and not attached):
[1] tools_3.2.3
RAM: 3 GB. Processor: Intel(R) Core(TM)2 Duo T6600 @ 2.20 GHz
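I tried searching online and looked at the bigmemory and ff packages, but I found them hard to understand.
I am fairly new to R. Any help would be appreciated.
My code: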
# read.csv already returns a data.frame, so no data.frame() wrapper is needed
d1 = read.csv("C:\\Users\\pankaj\\Downloads\\Mc Kinsey Hiring\\train.csv", na.strings = c("", " "))
View(d1)
#d1$Email_ID = row.names(d1$Email_ID)
View(d1)
d1$Email_Status = as.factor(d1$Email_Status)
d1$Email_Type = as.factor(d1$Email_Type)
d1$Email_Source_Type = as.factor(d1$Email_Source_Type)
d1$Customer_Location = as.factor(d1$Customer_Location)
d1$Email_Campaign_Type = as.factor(d1$Email_Campaign_Type)
d1$Time_Email_sent_Category = as.factor(d1$Time_Email_sent_Category)
summary(d1)
set.seed(1)
sp = sample(x = 68353,size = 54683)
train = d1[sp,]
test = d1[-sp,]
View(train)
View(test)
# removed all unnecessary data from environment. Leaving only the training data
rm(sp)
rm(d1)
rm(test)
# running lda
library(MASS)
lda.fit = lda(train$Email_Status ~. - train$Email_ID,data = train)
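A possible cause, sketched on toy data and not on the real file: inside a formula, train$Email_ID is a different term from the column Email_ID, so subtracting it may leave Email_ID in the model as a factor with one level per ID. The data frame toy and the columns Num_A and Num_B below are hypothetical stand-ins; only the names Email_ID and Email_Status come from my data.

```r
library(MASS)

# Toy stand-in for `train`; Num_A and Num_B are hypothetical predictors.
set.seed(1)
toy <- data.frame(
  Email_ID     = factor(sprintf("ID%03d", 1:60)),  # one level per row
  Email_Status = factor(rep(c("0", "1", "2"), each = 20)),
  Num_A        = rnorm(60),
  Num_B        = rnorm(60)
)

# Same shape as my call: `toy$Email_ID` does not match the column name
# Email_ID, so subtracting it leaves Email_ID among the predictors.
bad_labels <- attr(terms(toy$Email_Status ~ . - toy$Email_ID, data = toy),
                   "term.labels")

# Bare column names: `- Email_ID` removes the ID column from the model.
good_labels <- attr(terms(Email_Status ~ . - Email_ID, data = toy),
                    "term.labels")

print(bad_labels)
print(good_labels)

# Fit on the toy data using the bare-name formula.
toy.fit <- lda(Email_Status ~ . - Email_ID, data = toy)
```

On the real data the corresponding call would be lda(Email_Status ~ . - Email_ID, data = train). I have not verified that this resolves the memory error.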