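I am trying to calculate the prediction interval for a multiple linear regression by hand, and I noticed some discrepancies between the output of predict.lm() and my manual calculation in R.
Here is my code: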
library(MASS)
data("Boston") # using the boston dataset
df <- Boston
model <- lm(medv ~ crim + nox + age + dis + tax + black + 0, data = df) # using select variables rather than all variables
n = 3 # set observation number
(pred_val <- predict.lm(object = model, newdata = df[n,], interval = "prediction", level = 0.95))
Output of predict.lm():
fit lwr upr
24.91563 8.025451 41.8058
Now I try to calculate it by hand:
dof <- nrow(df) - 6 # calculate degrees of freedom, subtracting 6 predictor variables (no intercept term)
t_crit <- qt(p = c(.025, .975), df = dof) # calculate t-critical values for 95% confidence
se <- sigma(model) # get standard error of regression
pred_val[1] + (t_crit * se) # add the errors to the predicted value
Output:
[1] 8.060998 41.770252
The lower bound from predict.lm() is close to my manual calculation, but it is not an exact match! Is there something I am doing wrong?
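One likely source of the gap: the manual calculation uses only the residual standard error, sigma(model), whereas a prediction interval normally also includes the uncertainty of the fitted mean at the new point, so the standard error becomes sqrt(sigma^2 + se_fit^2). The sketch below is a minimal check of that idea, not a definitive account of what predict.lm() does internally. The names x0, se_fit, se_pred and manual are made up for illustration.

```r
library(MASS)
data("Boston")
df <- Boston
model <- lm(medv ~ crim + nox + age + dis + tax + black + 0, data = df)
n <- 3

# Illustrative names (x0, se_fit, se_pred, manual) are assumptions, not from the original code.
X  <- model.matrix(model)        # design matrix used in the fit (no intercept column)
x0 <- X[n, , drop = FALSE]       # 1 x 6 row of predictors for observation n

# Standard error of the fitted mean at x0: sqrt(x0 %*% Var(beta_hat) %*% t(x0))
se_fit <- sqrt(drop(x0 %*% vcov(model) %*% t(x0)))

# Standard error for a new observation: residual variance plus fitted-mean variance
se_pred <- sqrt(sigma(model)^2 + se_fit^2)

t_crit <- qt(c(0.025, 0.975), df = df.residual(model))
fit    <- unname(predict(model, newdata = df[n, ]))
manual <- fit + t_crit * se_pred  # manual lower and upper bounds
manual
```

If this is the cause, manual should agree with the lwr and upr columns returned by predict.lm(). Because se_fit depends on the row being predicted, the size of the gap would differ from one observation to the next.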