I have a dataset that I need to analyze in R with some ML algorithms. This is my code:
# data is a big data frame (351444 obs. of 39 variables). It contains physiological data about different users.
library('mlbench')
library('randomForest')
library('nnet')
library("kernlab")
library('caret')
library('rpart')
library('caretEnsemble')
library('pROC')
data$target <- as.factor(data$target) # I need to do a classification
set.seed(12)
trainIndex <- createDataPartition(data$target, p = .8,
                                  list = FALSE,
                                  times = 1)
Train <- data[ trainIndex,]
Test <- data[-trainIndex,]
my_control <- trainControl(
  method = 'cv',
  number = 5,
  savePredictions = TRUE,
  classProbs = TRUE,
  verboseIter = TRUE,
  allowParallel = TRUE,
  # use CV folds here so that 'index' matches method='cv' above;
  # createResample() would instead generate bootstrap resamples
  index = createFolds(Train$target, k = 5, returnTrain = TRUE),
  summaryFunction = twoClassSummary
)
model_list <- caretList(
  target ~ ., data = Train,
  trControl = my_control,
  metric = 'ROC',
  # methodList = c('glm', 'rpart'),
  tuneList = list(
    svmLinear = caretModelSpec(method = 'svmLinear', tuneGrid = expand.grid(.C = c(0.01, 0.1, 1, 3, 5, 10, 20))),
    svmPoly   = caretModelSpec(method = 'svmPoly',   tuneGrid = expand.grid(.degree = 2:5, .scale = .1, .C = c(0.01, 0.1, 1, 3, 5, 10, 20))),
    svmRadial = caretModelSpec(method = 'svmRadial', tuneGrid = expand.grid(.sigma = .1, .C = c(0.01, 0.1, 1, 3, 5, 10, 20))),
    rf        = caretModelSpec(method = 'rf'),
    nn        = caretModelSpec(method = 'nnet', tuneGrid = expand.grid(.size = 1:15, .decay = c(0.01, 0.1, 1, 3, 5, 10, 20)))
  )
)
When I run this code, everything seems to work at first. After a while, though, the caretList call fails with a memory allocation error:
Resample01: C= 0.01
model fit failed for Resample01: C= 0.01 Error : cannot allocate vector of size 427.1 Gb
- Resample01: C= 0.01
+ Resample01: C= 0.10
model fit failed for Resample01: C= 0.10 Error : cannot allocate vector of size 427.1 Gb
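For context on where a number that large can come from: the SVM methods here are backed by kernlab, which can materialize a dense n × n kernel matrix of doubles, so memory grows quadratically with the number of training rows. A back-of-the-envelope check (the row count below is just my ~80% training split, not anything caret reported):

```r
# Approximate memory for a dense n x n kernel matrix of doubles (8 bytes/entry)
n  <- 281156            # roughly 80% of 351444 rows, i.e. the training split
gb <- n^2 * 8 / 1024^3  # hundreds of GB, the same order as the error message
gb
```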
I somewhat expected this error, but I don't know how to work around it. I also don't know how to reduce the number of rows (they are physiological measurements from the users).
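One thing I considered, if subsampling is acceptable: since kernel SVM memory grows quadratically with the number of rows, training the SVM members on a stratified subsample would shrink the kernel matrix drastically. A minimal sketch using caret's createDataPartition (the 10% fraction is just a placeholder to tune against available RAM):

```r
library(caret)

set.seed(12)
# Stratified 10% subsample of the training set; createDataPartition
# preserves the class balance of `target`. Increase p as memory allows.
smallIndex <- createDataPartition(Train$target, p = 0.1, list = FALSE)
TrainSmall <- Train[smallIndex, ]
```

But I am not sure whether throwing away 90% of the rows is the right approach here.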
Can you help me?
Best regards