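I'm having trouble building a model with the nnet package. If I understand correctly, the size parameter is the number of neurons in the hidden layer. With size = 1 or 2 I get poor results. When I try 3 or more, I get an error: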
Error in (n1 + 1L):n2 : NA/NaN argument
Warning
In nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method, :
There were missing values in resampled performance measures.
net <- train(Y~., data=af,
method = 'nnet',
preProcess = NULL,
trControl = fitControl,
verbose = FALSE,
tuneGrid=data.frame(.size=2, .decay=5e-3),
MaxNWts=8000,
linout =T,
maxit=1000
)
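So what am I doing wrong, and how can I increase the size parameter without getting an error? My data has 338 observations and 2830 variables.
Thanks in advance.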
Answer 0 (score: 3)
download.file(url= "https://cloud.mail.ru/public/6e188e2baa1f%2Fdata.RData",
destfile = "data.RData")
# Load from the path the file was just downloaded to
load("data.RData")
require(nnet)
require(caret)
set.seed(123)
seeds <- vector(mode = "list", length = 51)
for(i in 1:50) seeds[[i]] <- sample.int(1000, 22)
# Last element: a single seed for the final model fit
seeds[[51]] <- sample.int(1000, 1)
fitControl <- trainControl(method = "adaptive_cv",
repeats = 5,
verboseIter = TRUE,
seeds = seeds)
net <- train(Y~., data=af,
method = 'nnet',
preProcess = NULL,
trControl = fitControl,
verbose = FALSE,
tuneGrid=data.frame(.size=3, .decay=5e-3),
MaxNWts=4000,
linout =T,
maxit=1000,
na.action = "na.omit"
)
The root of the problem is degrees of freedom.
Your data object af has the following dimensions:
> dim(af)
[1] 338 2830
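and you are regressing the 338 observations of Y on all 2,829 other columns (i.e., 2,830 - 1), which does not work because there are too few degrees of freedom.
Your model fails with two messages: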
model fit failed...Error in nnet.default(x, y, w, ...) : too many (8494) weights
There were missing values in resampled performance measures.
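Like you, I focused on the second message at first. I tried adding the na.action = "na.omit" argument to caret's train() function, which is a reasonable idea but does not fix this problem.
Then I realized that the first message is the more important one. It tells you the same thing: the number of weights, which is driven by the number of explanatory variables, is too high.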
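To see where the 8494 in that message comes from, here is a minimal sketch of the weight count for a single-hidden-layer network without skip-layer connections. The helpers nnet_weights and max_size are hypothetical names written for illustration, not functions from nnet or caret; the formula is an assumption that happens to reproduce both the 8494 in the error message and the "# weights: 37" reported in the output further down.

```r
# Each hidden unit has one weight per input plus a bias; each output unit
# has one weight per hidden unit plus a bias (no skip-layer connections).
# Hypothetical helper, not part of nnet or caret.
nnet_weights <- function(n_inputs, size, n_outputs = 1) {
  (n_inputs + 1) * size + (size + 1) * n_outputs
}

# Largest hidden-layer size whose weight count stays within MaxNWts.
# Hypothetical helper, derived by solving the formula above for size.
max_size <- function(n_inputs, max_nwts, n_outputs = 1) {
  floor((max_nwts - n_outputs) / (n_inputs + 1 + n_outputs))
}

nnet_weights(2829, size = 3)      # 8494, the figure in the error message
nnet_weights(2829, size = 2)      # 5663
nnet_weights(10, size = 3)        # 37
max_size(2829, max_nwts = 8000)   # 2
max_size(2829, max_nwts = 4000)   # 1
```

Under this count, MaxNWts=8000 from the question admits size = 2 but not size = 3, which would match the behaviour described there. Raising MaxNWts would silence the error, but with only 338 observations the degrees-of-freedom problem would remain.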
So I took what you had and reduced the number of regressors (the number of columns used), and now it works fine:
> X <- af[, 1:10]
> net <- train(y=af$Y, x=X,
+ method = 'nnet',
+ preProcess = NULL,
+ trControl = fitControl,
+ verbose = FALSE,
+ tuneGrid=data.frame(.size=3, .decay=5e-3),
+ MaxNWts=4000,
+ linout =T,
+ maxit=1000,
+ na.action = "na.omit"
+ )
# weights: 37
initial value 11680.837531
iter 10 value 567.695784
iter 20 value 536.975033
iter 30 value 531.229395
iter 40 value 529.138440
iter 50 value 528.186846
iter 60 value 527.256175
iter 70 value 526.785917
iter 80 value 524.733350
iter 90 value 523.327685
iter 100 value 523.153812
iter 110 value 522.866651
iter 120 value 522.520324
iter 130 value 519.557709
iter 140 value 519.034003
iter 150 value 518.865745
iter 160 value 518.817728
iter 170 value 518.782720
iter 180 value 518.733853
iter 190 value 518.676881
iter 200 value 518.664181
iter 210 value 518.657738
iter 220 value 518.656348
iter 230 value 518.641653
iter 240 value 518.634891
iter 250 value 518.633536
iter 260 value 518.632945
iter 270 value 518.632559
iter 280 value 518.632273
final value 518.632207
converged
# weights: 37
initial value 17038.362932
iter 10 value 501.607084
iter 20 value 492.283099
iter 30 value 491.783153
iter 40 value 490.882435
iter 50 value 490.509487
iter 60 value 490.347638
iter 70 value 490.276362
iter 80 value 490.135871
iter 90 value 490.046862
iter 100 value 490.042117
iter 110 value 490.040552
iter 120 value 490.038263
iter 130 value 490.033484
final value 490.031813
converged
# weights: 37
initial value 14383.923273
iter 10 value 689.694268
iter 20 value 632.520128
iter 30 value 506.806825
iter 40 value 497.204789
iter 50 value 496.058566
iter 60 value 495.153170
iter 70 value 493.646931
iter 80 value 492.361818
iter 90 value 492.088474
iter 100 value 492.008913
iter 110 value 491.498147
iter 120 value 491.345332
iter 130 value 491.306452
iter 140 value 491.290815
iter 150 value 491.278200
iter 160 value 491.244056
iter 170 value 491.232009
iter 180 value 491.226179
iter 190 value 491.212958
iter 200 value 491.209201
iter 210 value 491.208694
final value 491.208581
converged
# weights: 37
initial value 11084.289612
iter 10 value 622.044592
iter 20 value 520.053760
iter 30 value 507.456841
iter 40 value 502.000020
iter 50 value 497.723332
iter 60 value 495.164695
iter 70 value 494.157722
iter 80 value 492.678614
iter 90 value 491.936464
iter 100 value 491.706778
iter 110 value 491.360232
iter 120 value 491.141080
iter 130 value 490.920068
iter 140 value 490.714486
iter 150 value 490.578641
iter 160 value 490.542580
iter 170 value 490.524909
iter 180 value 490.486804
iter 190 value 490.338614
iter 200 value 490.110047
iter 210 value 489.915521
iter 220 value 489.812929
iter 230 value 489.756234
iter 240 value 489.718024
iter 250 value 489.711636
iter 260 value 489.702732
iter 270 value 489.661668
iter 280 value 489.644434
iter 290 value 489.633946
iter 300 value 489.625405
iter 310 value 489.593120
iter 320 value 489.572627
iter 330 value 489.569740
iter 340 value 489.528552
iter 350 value 489.511607
iter 360 value 489.508993
final value 489.508289
converged
# weights: 37
initial value 16479.112645
iter 10 value 615.081555
iter 20 value 501.104433
iter 30 value 481.204462
iter 40 value 479.305649
iter 50 value 477.681071
iter 60 value 476.668767
iter 70 value 476.010448
iter 80 value 475.480687
iter 90 value 474.542118
iter 100 value 473.701240
iter 110 value 473.345255
iter 120 value 473.099798
iter 130 value 472.974764
iter 140 value 472.849499
iter 150 value 472.685356
iter 160 value 472.521386
iter 170 value 472.438417
iter 180 value 472.294666
iter 190 value 472.250269
iter 200 value 472.231414
iter 210 value 472.226611
iter 220 value 472.222344
iter 230 value 472.213415
iter 240 value 472.208953
iter 250 value 472.204120
iter 260 value 472.199380
iter 270 value 472.196929
iter 280 value 472.195935
iter 290 value 472.195052
iter 300 value 472.194005
iter 310 value 472.193293
final value 472.193152
converged
# weights: 37
initial value 16669.126242
iter 10 value 888.977047
iter 20 value 637.124939
iter 30 value 624.802902
iter 40 value 622.441515
iter 50 value 618.396731
iter 60 value 618.219093
iter 70 value 617.815129
iter 80 value 617.295605
iter 90 value 616.955292
iter 100 value 616.817402
iter 110 value 616.766023
iter 120 value 616.749874
iter 130 value 616.745140
iter 140 value 616.724481
iter 150 value 616.706785
iter 160 value 616.696147
iter 170 value 616.516945
iter 180 value 616.288825
iter 190 value 615.986669
iter 200 value 615.853741
iter 210 value 615.759414
iter 220 value 615.704034
iter 230 value 615.669519
iter 240 value 615.642135
iter 250 value 615.626347
iter 260 value 615.548658
iter 270 value 615.526684
iter 280 value 615.509073
iter 290 value 615.494325
iter 300 value 615.481876
iter 310 value 615.462706
iter 320 value 615.459488
iter 330 value 615.450263
iter 340 value 615.438322
iter 350 value 615.430339
iter 360 value 615.423634
iter 370 value 615.418235
iter 380 value 615.391565
iter 390 value 615.383094
iter 400 value 615.379039
iter 410 value 615.367349
iter 420 value 615.361426
iter 430 value 615.358651
final value 615.358194
converged
>
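Note that I chose the columns used in X arbitrarily. You can (and should) choose columns based on their performance, unless you have a theoretical framework to guide the choice. For example, you could select columns with a stepwise or lasso regression approach, or by manually fitting various models and comparing their performance with ROC curves.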
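As one illustration of the stepwise option, here is a minimal sketch using forward selection with base R's step(). It runs on simulated data rather than on af, and every name in it (dat, x1 to x20, selected) is hypothetical. Treat it as a starting point under those assumptions, not as the recommended procedure for this data set.

```r
# Simulated stand-in for af: 100 rows, 20 candidate predictors, of which
# only x1 and x2 actually drive Y. All names here are hypothetical.
set.seed(42)
n <- 100
p <- 20
X <- as.data.frame(matrix(rnorm(n * p), nrow = n))
names(X) <- paste0("x", seq_len(p))
dat <- cbind(Y = 3 * X$x1 - 2 * X$x2 + rnorm(n), X)

# Forward stepwise selection by AIC, starting from the intercept-only model
# and allowing any of the candidate columns to enter.
null_fit <- lm(Y ~ 1, data = dat)
fwd <- step(null_fit,
            scope = list(lower = ~ 1, upper = reformulate(names(X))),
            direction = "forward", trace = 0)

# Names of the selected predictors; these are the columns one would then
# pass to train() as x instead of an arbitrary slice such as af[, 1:10].
selected <- attr(terms(fwd), "term.labels")
print(selected)
```

Forward selection is used here because a model that starts with all 2,829 columns cannot be fit to 338 rows, so backward elimination is not available. AIC-based selection can also admit a few noise columns, so the result is worth checking against held-out performance.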