Using the "prostate" dataset from the "ElemStatLearn" package.
library(caret)
library(ElemStatLearn)   # provides the 'prostate' data
data(prostate)
trainset = subset(prostate, train)[, 1:9]   # assuming trainset is the built-in training split, dropping the 'train' column
set.seed(3434)
fit.lm = train(data = trainset, lpsa ~ ., method = "lm")
fit.ridge = train(data = trainset, lpsa ~ ., method = "ridge")
fit.lasso = train(data = trainset, lpsa ~ ., method = "lasso")
Comparison of RMSE (at bestTune for the ridge and lasso cases):
fit.lm$results[,"RMSE"]
[1] 0.7895572
fit.ridge$results[fit.ridge$results[,"lambda"]==fit.ridge$bestTune$lambda,"RMSE"]
[1] 0.8231873
fit.lasso$results[fit.lasso$results[,"fraction"]==fit.lasso$bestTune$fraction,"RMSE"]
[1] 0.7779534
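One thing worth noting about this comparison: by default, each train() call draws its own bootstrap resamples, so the three RMSE values above are not computed on identical resamples, and the default tuning grid for ridge may not contain a good lambda. A minimal sketch of a fairer comparison, assuming the same trainset as above (fold count and lambda grid are illustrative choices, not values from the original post):

```r
# Use the same CV folds for every model so the RMSE values are comparable
ctrl = trainControl(method = "cv", number = 10)

set.seed(3434)
fit.lm2 = train(lpsa ~ ., data = trainset, method = "lm", trControl = ctrl)

set.seed(3434)   # same seed before each call -> identical CV folds
fit.ridge2 = train(lpsa ~ ., data = trainset, method = "ridge",
                   trControl = ctrl,
                   tuneGrid = data.frame(lambda = seq(0, 0.1, length = 15)))

# Summarize performance over the shared resamples
summary(resamples(list(lm = fit.lm2, ridge = fit.ridge2)))
```

With matched folds and a denser lambda grid, any remaining RMSE gap reflects the models rather than resampling noise.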
Comparing the absolute values of the coefficients:
abs(round(fit.lm$finalModel$coefficients,2))
(Intercept) lcavol lweight age lbph svi lcp gleason pgg45
0.43 0.58 0.61 0.02 0.14 0.74 0.21 0.03 0.01
abs(round(predict(fit.ridge$finalModel, type = "coef", mode = "norm")$coefficients[8,],2))
lcavol lweight age lbph svi lcp gleason pgg45
0.49 0.62 0.01 0.14 0.65 0.05 0.00 0.01
abs(round(predict(fit.lasso$finalModel, type = "coef", mode = "norm")$coefficients[8,],2))
lcavol lweight age lbph svi lcp gleason pgg45
0.56 0.61 0.02 0.14 0.72 0.18 0.00 0.01
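A caveat about the two predict() calls above: the hard-coded row index [8,] picks the eighth step of the solution path, which is not guaranteed to match bestTune. Assuming caret is using the elasticnet backend for "ridge" and "lasso" (as it does by default), a sketch that reads the coefficients at the tuned value directly:

```r
# Lasso: coefficients at the cross-validated best fraction
predict(fit.lasso$finalModel, type = "coef", mode = "fraction",
        s = fit.lasso$bestTune$fraction)$coefficients

# Ridge: lambda is fixed inside finalModel, so s = 1 (the full path)
# returns the ridge solution at the tuned lambda
predict(fit.ridge$finalModel, type = "coef", mode = "fraction",
        s = 1)$coefficients
```

If these differ from the row-8 values, the earlier coefficient comparison was made at the wrong point on the path.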
My question is: how can the RMSE for "ridge" be higher than for plain "lm"? Doesn't that defeat the purpose of penalized regression compared to plain "lm"?

Also, how can the absolute value of the "lweight" coefficient actually be higher under ridge (0.62) than under lm (0.61)? Without abs(), both coefficients are positive anyway.

I expected ridge to behave similarly to lasso, which not only lowered the RMSE but also shrank the coefficient magnitudes relative to plain "lm".

Thanks!