Significant quadratic term - linear regression - R

Date: 2013-10-17 19:42:57

Tags: r statistics linear-regression

I was given the following data:

[image: the data table]

I was told to first fit a quadratic model.


> time = c(10,20,15,11,11,19,11,13,17,18,16,16,17,18,10)
> experience = c(24,1,10,15,17,3,20,9,3,1,7,9,7,5,20)
> fit = lm(time ~ experience + I(experience^2))
> summary(fit)

Call:
lm(formula = y ~ x + I(x^2))

Residuals:
    Min      1Q  Median      3Q     Max 
-1.8287 -0.8300  0.5054  0.7476  1.1713 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 20.091108   0.724705  27.723    3e-12 ***
x           -0.670522   0.154706  -4.334 0.000972 ***
I(x^2)       0.009535   0.006326   1.507 0.157605    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.091 on 12 degrees of freedom
Multiple R-squared:  0.9162,    Adjusted R-squared:  0.9022 
F-statistic: 65.59 on 2 and 12 DF,  p-value: 3.465e-07

Everything seems fine there.

My model is

y = 20.091 - 0.671x + 0.0095x^2

Plotting it:

> plot(experience, time)  # draw the data first; lines() only adds to an existing plot
> x = seq(0, 25, by = .1)
> y = coef(fit)[1] + coef(fit)[2]*x + coef(fit)[3]*x^2
> lines(x, y)

[image: plot of the data with the fitted curve]

Again, everything seems fine.

But then I was told to test whether the quadratic term is significant at the α = .1 significance level.

So I did

> fit1 = lm (time ~ experience + I(experience^2))
> fit2 = lm(time~experience)
> anova(fit2, fit1)

Analysis of Variance Table

Model 1: time ~ experience
Model 2: time ~ experience + I(experience^2)
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     13 16.984                           
2     12 14.280  1    2.7037 2.2719 0.1576

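So the F value for the quadratic term is 2.27, which corresponds to a probability of .1576. Since .1576 > .1, the quadratic term is significant at α = .1.

However, my professor has indicated that we should find the quadratic term to be insignificant to our model. What am I doing wrong here?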

1 Answer:

Answer 0 (score: 1)

What you failed to do is construct orthogonal polynomial terms. The poly() function in R is designed for this purpose.

 time = c(10,20,15,11,11,19,11,13,17,18,16,16,17,18,10)
 experience = c(24,1,10,15,17,3,20,9,3,1,7,9,7,5,20)
 fit = lm (time ~ poly(experience, degree=2))
 summary(fit)
#--------------
Call:
lm(formula = time ~ poly(experience, degree = 2))

Residuals:
    Min      1Q  Median      3Q     Max 
-1.8287 -0.8300  0.5054  0.7476  1.1713 

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)    
(Intercept)                    14.8000     0.2817  52.544 1.48e-15 ***
poly(experience, degree = 2)1 -12.3861     1.0909 -11.354 8.94e-08 ***
poly(experience, degree = 2)2   1.6443     1.0909   1.507    0.158    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.091 on 12 degrees of freedom
Multiple R-squared:  0.9162,    Adjusted R-squared:  0.9022 
F-statistic: 65.59 on 2 and 12 DF,  p-value: 3.465e-07
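
The two summaries shown in this thread report the same t value (1.507) and essentially the same p-value (about 0.158) for the highest-order term. A minimal sketch to check this directly, assuming the same data; the object names fit_raw, fit_orth, quad_raw, quad_orth and same_fit are only illustrative:

```r
time <- c(10,20,15,11,11,19,11,13,17,18,16,16,17,18,10)
experience <- c(24,1,10,15,17,3,20,9,3,1,7,9,7,5,20)

# Raw polynomial terms vs. orthogonal polynomial terms
fit_raw  <- lm(time ~ experience + I(experience^2))
fit_orth <- lm(time ~ poly(experience, degree = 2))

# Row 3 of each coefficient table is the quadratic term
quad_raw  <- coef(summary(fit_raw))[3, c("t value", "Pr(>|t|)")]
quad_orth <- coef(summary(fit_orth))[3, c("t value", "Pr(>|t|)")]
print(rbind(raw = quad_raw, orthogonal = quad_orth))

# Both parameterizations describe the same fitted curve
same_fit <- all.equal(fitted(fit_raw), fitted(fit_orth))
print(same_fit)
```

The lower-order coefficients and their tests do differ between the two fits, because the raw terms experience and experience^2 are correlated while the poly() columns are not.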

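The F statistic at the bottom of the summary() output is not specific to the quadratic term; it actually compares the null (intercept-only) model with the two-term model.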
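
To make the distinction concrete, here is a rough sketch that computes both F statistics side by side, assuming the same data. The names fit_null, fit_linear, fit_quad and alpha are illustrative only. With a single added term, the partial F from anova() should equal the square of that term's t statistic.

```r
time <- c(10,20,15,11,11,19,11,13,17,18,16,16,17,18,10)
experience <- c(24,1,10,15,17,3,20,9,3,1,7,9,7,5,20)

fit_null   <- lm(time ~ 1)                               # intercept only
fit_linear <- lm(time ~ experience)                      # linear term only
fit_quad   <- lm(time ~ experience + I(experience^2))    # linear + quadratic

# Overall F reported by summary(): intercept-only model vs. quadratic model
overall_F     <- summary(fit_quad)$fstatistic["value"]
anova_overall <- anova(fit_null, fit_quad)

# Partial F for the quadratic term alone: linear model vs. quadratic model
anova_quad <- anova(fit_linear, fit_quad)
partial_F  <- anova_quad[2, "F"]
p_quad     <- anova_quad[2, "Pr(>F)"]

# t statistic of the quadratic term from the coefficient table
t_quad <- coef(summary(fit_quad))[3, "t value"]
print(c(overall_F = unname(overall_F), partial_F = partial_F, t_squared = t_quad^2))

# Usual decision rule: reject the null (term is zero) only if p < alpha
alpha  <- 0.1
reject <- p_quad < alpha
print(reject)
```

Under the usual rule, the p-value of .1576 reported in the question is larger than .1, so the null hypothesis that the quadratic coefficient is zero would not be rejected at that level.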