How do I perform subset selection for a linear regression model?

Asked: 2015-11-12 10:02:43

Tags: linear-regression

I am working with the mtcars dataset and fitting a linear regression:

data(mtcars)
fit <- lm(mpg ~ ., data = mtcars)
summary(fit)

When I fit the model with lm, I get the following result:

Call:
lm(formula = mpg ~ ., data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.5087 -1.3584 -0.0948  0.7745  4.6251 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept) 23.87913   20.06582   1.190   0.2525  
cyl6        -2.64870    3.04089  -0.871   0.3975  
cyl8        -0.33616    7.15954  -0.047   0.9632  
disp         0.03555    0.03190   1.114   0.2827  
hp          -0.07051    0.03943  -1.788   0.0939 .
drat         1.18283    2.48348   0.476   0.6407  
wt          -4.52978    2.53875  -1.784   0.0946 .
qsec         0.36784    0.93540   0.393   0.6997  
vs1          1.93085    2.87126   0.672   0.5115  
amManual     1.21212    3.21355   0.377   0.7113  
gear4        1.11435    3.79952   0.293   0.7733  
gear5        2.52840    3.73636   0.677   0.5089  
carb2       -0.97935    2.31797  -0.423   0.6787  
carb3        2.99964    4.29355   0.699   0.4955  
carb4        1.09142    4.44962   0.245   0.8096  
carb6        4.47757    6.38406   0.701   0.4938  
carb8        7.25041    8.36057   0.867   0.3995  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.833 on 15 degrees of freedom
Multiple R-squared:  0.8931,    Adjusted R-squared:  0.779 
F-statistic:  7.83 on 16 and 15 DF,  p-value: 0.000124

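I see that none of the variables is flagged as significant at the 0.05 level.

To find the important variables, I want to do subset selection and find the best combination of variables to use as predictors for the response variable mpg.

1 Answer:

Answer 0 (score: 1)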

The function regsubsets in the leaps package performs best-subset regression (see ?leaps). Adapting your code:

    library(leaps)
    regfit <- regsubsets(mpg ~ ., data = mtcars)
    summary(regfit)
    # or for a more visual display
    plot(regfit, scale = "Cp")
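
To go from the summary to an actual choice of variables, one option is to pick the model size that optimizes a criterion and then read off its coefficients. The following is only a sketch: it assumes the stock numeric mtcars (the output in the question suggests some columns were converted to factors, in which case there are more dummy columns and nvmax would need to be raised), and the names regsum, best_cp, best_bic and best_adjr2 are illustrative.

```r
library(leaps)

data(mtcars)

# nvmax = 10 lets the search go up to all 10 predictors
# (regsubsets stops at 8 variables by default)
regfit <- regsubsets(mpg ~ ., data = mtcars, nvmax = 10)
regsum <- summary(regfit)

# Model size (number of predictors) favoured by each criterion
best_cp    <- which.min(regsum$cp)     # Mallows' Cp: smaller is better
best_bic   <- which.min(regsum$bic)    # BIC: smaller is better
best_adjr2 <- which.max(regsum$adjr2)  # adjusted R-squared: larger is better

# Coefficients of the best model of the size chosen by BIC
coef(regfit, best_bic)
```

The criteria do not have to agree on a size; BIC usually favours smaller models than adjusted R-squared. Since regsubsets only ranks subsets, it may be worth refitting the chosen variables with lm to get the usual summary.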