Penalized regression: "ridge" RMSE is higher than "lm" RMSE

Asked: 2019-02-13 12:22:49

Tags: r machine-learning r-caret

I am using the "prostate" dataset from the "ElemStatLearn" package.


library(caret)
library(ElemStatLearn)   # prostate data
data(prostate)
trainset <- subset(prostate, train, select = -train)

set.seed(3434)
fit.lm    = train(lpsa ~ ., data = trainset, method = "lm")
fit.ridge = train(lpsa ~ ., data = trainset, method = "ridge")
fit.lasso = train(lpsa ~ ., data = trainset, method = "lasso")
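One thing worth checking: by default `train()` evaluates only a small automatic tuning grid (`tuneLength = 3`), so the ridge `lambda` candidates may all be poorly placed for this data. A sketch of fitting ridge with an explicit resampling scheme and a wider, hypothetical `lambda` grid (the grid values here are illustrative, not tuned):

```r
library(caret)

# Repeated 10-fold cross-validation instead of the default bootstrap
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)

# A denser lambda grid than caret's default three values
ridge.grid <- expand.grid(lambda = seq(0, 0.1, length.out = 15))

fit.ridge2 <- train(lpsa ~ ., data = trainset,
                    method    = "ridge",
                    trControl = ctrl,
                    tuneGrid  = ridge.grid)

fit.ridge2$bestTune   # lambda actually selected on this grid
```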

Comparing RMSE (at `bestTune` for the ridge and lasso cases):

fit.lm$results[,"RMSE"]
[1] 0.7895572

fit.ridge$results[fit.ridge$results[,"lambda"]==fit.ridge$bestTune$lambda,"RMSE"]
[1] 0.8231873

fit.lasso$results[fit.lasso$results[,"fraction"]==fit.lasso$bestTune$fraction,"RMSE"]
[1] 0.7779534
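Note that each of these is a single resampled estimate, and each `train()` call above draws its own resamples. To make the comparison apples-to-apples, the resampling indices can be fixed up front and the models compared with `resamples()`; a sketch (reusing `trainset` from above):

```r
library(caret)

# Fix the bootstrap indices so all models see identical resamples
set.seed(3434)
idx  <- createResample(trainset$lpsa, times = 25)
ctrl <- trainControl(method = "boot", index = idx)

fit.lm2    <- train(lpsa ~ ., data = trainset, method = "lm",    trControl = ctrl)
fit.ridge2 <- train(lpsa ~ ., data = trainset, method = "ridge", trControl = ctrl)

# RMSE distributions over the shared resamples
summary(resamples(list(lm = fit.lm2, ridge = fit.ridge2)))
```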

Comparing the absolute values of the coefficients:

abs(round(fit.lm$finalModel$coefficients,2))
(Intercept)   lcavol  lweight  age    lbph    svi     lcp     gleason    pgg45
       0.43   0.58    0.61     0.02   0.14    0.74    0.21    0.03       0.01 

abs(round(predict(fit.ridge$finalModel, type = "coef", mode = "norm")$coefficients[8,],2))
              lcavol    lweight   age    lbph    svi    lcp    gleason    pgg45
              0.49      0.62      0.01   0.14    0.65   0.05   0.00       0.01

abs(round(predict(fit.lasso$finalModel, type = "coef", mode = "norm")$coefficients[8,],2))
              lcavol   lweight   age    lbph    svi    lcp    gleason   pgg45
              0.56     0.61      0.02   0.14    0.72   0.18   0.00      0.01 

My question is: how can the "ridge" RMSE be higher than that of plain "lm"? Doesn't that defeat the purpose of penalized regression compared to plain "lm"?

Also, how can the absolute value of the "lweight" coefficient actually be higher under ridge (0.62) than under lm (0.61)? Both coefficients were positive to begin with, even without `abs()`.

I expected ridge to behave like lasso, which not only lowered the RMSE but also shrank the coefficient magnitudes relative to plain "lm".

Thanks!

0 Answers