I was given the following data:
I was told to first fit a quadratic model.
> time = c(10,20,15,11,11,19,11,13,17,18,16,16,17,18,10)
> experience = c(24,1,10,15,17,3,20,9,3,1,7,9,7,5,20)
> fit = lm (time ~ experience + I(experience^2))
> summary(fit)
Call:
lm(formula = y ~ x + I(x^2))
Residuals:
    Min      1Q  Median      3Q     Max
-1.8287 -0.8300  0.5054  0.7476  1.1713
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 20.091108   0.724705  27.723    3e-12 ***
x           -0.670522   0.154706  -4.334 0.000972 ***
I(x^2)       0.009535   0.006326   1.507 0.157605
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.091 on 12 degrees of freedom
Multiple R-squared: 0.9162, Adjusted R-squared: 0.9022
F-statistic: 65.59 on 2 and 12 DF, p-value: 3.465e-07
Everything there seems fine.
My model is
y = 20.091 - 0.671x + 0.0095x^2
Plotting it:
> plot(experience, time)
> x = seq(0, 25, by = .1)
> y = coef(fit)[1] + coef(fit)[2]*x + coef(fit)[3]*x^2
> lines(x, y)
Again, everything seems fine.
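The same curve can also be drawn with predict() instead of writing out the coefficients by hand. This is only a minimal sketch, assuming the data and quadratic model above; the names grid and time_hat are made up for illustration.

```r
time <- c(10,20,15,11,11,19,11,13,17,18,16,16,17,18,10)
experience <- c(24,1,10,15,17,3,20,9,3,1,7,9,7,5,20)
fit <- lm(time ~ experience + I(experience^2))

# Evaluate the fitted model on a fine grid of experience values
# (grid and time_hat are illustrative names, not from the original post)
grid <- data.frame(experience = seq(0, 25, by = 0.1))
grid$time_hat <- predict(fit, newdata = grid)

# Scatter plot of the data with the fitted quadratic curve on top
plot(experience, time)
lines(grid$experience, grid$time_hat)
```

Using predict() avoids retyping the formula, so the curve stays correct if the model terms change.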
But then I was told to test whether the quadratic term is significant at the α = 0.1 significance level.
So I did
> fit1 = lm (time ~ experience + I(experience^2))
> fit2 = lm(time~experience)
> anova(fit2, fit1)
Analysis of Variance Table
Model 1: time ~ experience
Model 2: time ~ experience + I(experience^2)
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     13 16.984
2     12 14.280  1    2.7037 2.2719 0.1576
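So the F value for the quadratic term is 2.27, which corresponds to a probability of .1576. Since .1576 > .1, the quadratic term is significant at α = .1.
However, my professor has already indicated that we should find the quadratic term to be insignificant for our model. What am I doing wrong here?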
Answer 0 (score: 1)
What you failed to do is construct orthogonal polynomial terms. The poly() function in R is designed for this purpose.
time = c(10,20,15,11,11,19,11,13,17,18,16,16,17,18,10)
experience = c(24,1,10,15,17,3,20,9,3,1,7,9,7,5,20)
fit = lm (time ~ poly(experience, degree=2))
summary(fit)
#--------------
Call:
lm(formula = time ~ poly(experience, degree = 2))
Residuals:
    Min      1Q  Median      3Q     Max
-1.8287 -0.8300  0.5054  0.7476  1.1713
Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
(Intercept)                    14.8000     0.2817  52.544 1.48e-15 ***
poly(experience, degree = 2)1 -12.3861     1.0909 -11.354 8.94e-08 ***
poly(experience, degree = 2)2   1.6443     1.0909   1.507    0.158
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.091 on 12 degrees of freedom
Multiple R-squared: 0.9162, Adjusted R-squared: 0.9022
F-statistic: 65.59 on 2 and 12 DF, p-value: 3.465e-07
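The F-statistic reported by summary() is not specific to the quadratic term; it actually compares the null model with the two-term model.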
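To test the quadratic term on its own, one can either read its row in the coefficient table or compare the nested models with anova(). For a single added term the two tests agree (the partial F is the square of the t value, here roughly 1.507^2 ≈ 2.27), and the result for the highest-order term should be the same whether raw or orthogonal polynomials are used. A minimal sketch, assuming the data above; fit_lin, fit_raw, fit_orth and alpha are names chosen here for illustration:

```r
time <- c(10,20,15,11,11,19,11,13,17,18,16,16,17,18,10)
experience <- c(24,1,10,15,17,3,20,9,3,1,7,9,7,5,20)

fit_lin  <- lm(time ~ experience)                      # linear only
fit_raw  <- lm(time ~ experience + I(experience^2))    # raw quadratic
fit_orth <- lm(time ~ poly(experience, degree = 2))    # orthogonal quadratic

# Row 3 of each coefficient table is the quadratic term
t_raw  <- summary(fit_raw)$coefficients[3, "t value"]
p_raw  <- summary(fit_raw)$coefficients[3, "Pr(>|t|)"]
p_orth <- summary(fit_orth)$coefficients[3, "Pr(>|t|)"]

# Partial F-test for adding the quadratic term to the linear model
cmp    <- anova(fit_lin, fit_raw)
F_quad <- cmp$F[2]
p_F    <- cmp$`Pr(>F)`[2]

# The three p-values should coincide, and F should equal t^2
print(c(p_raw = p_raw, p_orth = p_orth, p_F = p_F))
print(all.equal(t_raw^2, F_quad))

# Decision rule: the term is declared significant only if p < alpha
alpha <- 0.1
print(p_F < alpha)
```

With the outputs shown above, all three p-values are about 0.158, which is larger than 0.1, so the quadratic term would not be declared significant at that level.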