I want to use caret to build a linear regression model and estimate its performance with 10-fold cross-validation.
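library(caret)

## 10-fold CV, repeated ten times; keep the held-out predictions
fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 10,
                           savePredictions = TRUE)
## X_B: predictors (23 rows, 4 columns); Y_B: numeric outcome
Fit1 <- train(X_B, Y_B,
              method = "glm",
              trControl = fitControl)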
> Fit1
Generalized Linear Model
23 samples
4 predictor
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 21, 20, 21, 21, 21, 21, ...
Resampling results
  RMSE       Rsquared   RMSE SD     Rsquared SD
  0.1521155  0.8742447  0.07348565  0.2732692
The cross-validated results look good. But when I compare the fitted values with the observations,
> cor(Fit1$finalModel$fitted.values,Y_B)
[1] 0.6307828
the result is very different from the cross-validation result. I would really appreciate your help, thank you.
Here is the data I used.
X_B
82 67.5 89 540
82 79.4 33 33
82 66.6 43 231
66.6 82 55 51
82 66.6 116 231
66.6 53 55 151
67.5 66.2 28 28
82 82 120 116
82 67.5 53 203
66.6 82 36 32
82 66.6 235 229
66.6 82 24 23
82 82 130 381
82 66.6 38 245
82 47.3 70 62
82 66.6 132 262
68.4 82 25 24
82 67.5 103 244
65.6 82 34 28
82 66.6 73 225
67.5 53 55 54
82 82 213 287
66.6 82 65 61
Y_B
1.18650088809947
1.07726763717805
0.703157894736842
1.05601659751037
1.08866442199776
0.955510616784631
0.77390180878553
1.00677200902935
0.870726495726496
0.730769230769231
0.804239401496259
0.897186147186147
1.3880764904387
0.861434108527132
0.755862068965517
0.996685082872928
0.888789237668161
0.894220283533261
0.931395348837209
0.97422126745435
0.84297520661157
0.995975855130785
1.23547717842324
Answer 0 (score: 1)
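In your case the number of cross-validation folds is close to the number of rows. With 23 samples and 10 folds, each fold holds out only two or three rows, so the predictions and the metrics computed within each fold rest on very few examples and become unreliable, and so does the reported accuracy.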
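
A minimal sketch of why such small folds can inflate Rsquared. It assumes, as I understand caret's default summary, that Rsquared is the squared correlation between predictions and observations computed inside each held-out fold and then averaged. The data here is synthetic noise, not the poster's X_B and Y_B, and the names obs, pred, folds and r2_by_fold are made up for illustration.

```r
set.seed(1)
n <- 23
obs  <- rnorm(n)   # synthetic outcome (assumption, not the question's data)
pred <- rnorm(n)   # "predictions" that are unrelated to obs by construction

# Assign rows to 10 folds: 3 folds get 3 rows, the other 7 get 2 rows
folds <- sample(rep(1:10, length.out = n))

# Per-fold squared correlation, then the average across folds
r2_by_fold <- sapply(split(seq_len(n), folds),
                     function(idx) cor(pred[idx], obs[idx])^2)
mean(r2_by_fold)

# Pooled squared correlation over all 23 rows at once
cor(pred, obs)^2
```

Any fold with exactly two rows gives a correlation of +1 or -1, so its squared correlation is 1 no matter how poor the predictions are. With 7 of the 10 folds holding two rows, the averaged figure is at least about 0.7 even for pure noise. Note also that the question compares a correlation (0.63) with a squared correlation (0.87); squaring 0.63 gives roughly 0.40. Since savePredictions is enabled in the question, the held-out predictions should be available in Fit1$pred (columns pred and obs), so a pooled figure such as cor(Fit1$pred$pred, Fit1$pred$obs)^2 may be a fairer comparison than the per-fold average.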