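I fit a power model to a dataset by applying a simple linear model with the R function `lm` after a log-log transformation, as in the example below (instead of, for example, fitting the power model directly with the `nls` function).
I can then use the function `predict.lm` to apply the model to new data and compute prediction intervals.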
data(stackloss)
dat <- stackloss[c(2, 4)]
colnames(dat) <- c("x", "y")
dat.lm <- lm(log(y) ~ log(x), data = dat)
new <- data.frame(x = seq(1, 30, 1))  ## start at 1, since log(0) is -Inf
pred <- predict.lm(dat.lm, new, interval = "prediction", level = 0.95)
matplot(new$x, exp(pred), type = "l", col = 1, lty = c(1, 2, 2))
points(dat$x, dat$y)
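To make the link between the log-log fit and the power model explicit: `log(y) = b0 + b1 * log(x)` is the same as `y = exp(b0) * x^b1`. The following is a minimal sketch of that back-transformation; the names `a_hat`, `b_hat` and `x0` are mine and purely illustrative.

```r
data(stackloss)
dat <- stackloss[c(2, 4)]
colnames(dat) <- c("x", "y")
dat.lm <- lm(log(y) ~ log(x), data = dat)

## log(y) = b0 + b1 * log(x)  <=>  y = exp(b0) * x^b1
a_hat <- unname(exp(coef(dat.lm)[1]))  ## multiplicative constant (illustrative name)
b_hat <- unname(coef(dat.lm)[2])       ## exponent (illustrative name)

## the power curve agrees with exp() of the log-scale prediction
x0 <- data.frame(x = c(18, 20, 25))    ## arbitrary positive x values
all.equal(a_hat * x0$x^b_hat, unname(exp(predict(dat.lm, x0))))
```

Note that if the errors are normal on the log scale, `exp()` of the log-scale fit estimates the conditional median of `y` rather than its mean, which matters once the back-transformed values are summed.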
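Now I need to sum n predicted values (which is straightforward after applying the `exp` function), and I also need to compute the variance and the prediction interval of that sum.
For a simple linear model, the latter is described in the following Q&A: linear model with `lm`: how to get prediction variance of sum of predicted values.
That interesting answer introduces the following functions for a simple linear model: `lm_predict` (which can compute the complete variance-covariance matrix of the predicted values) and `agg_pred`.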
lm_predict <- function (lmObject, newdata, diag = TRUE) {
  ## input checking
  if (!inherits(lmObject, "lm")) stop("'lmObject' is not a valid 'lm' object!")
  ## extract "terms" object from the fitted model, but delete response variable
  tm <- delete.response(terms(lmObject))
  ## linear predictor matrix
  Xp <- model.matrix(tm, newdata)
  ## predicted values by direct matrix-vector multiplication
  pred <- c(Xp %*% coef(lmObject))
  ## efficiently form the complete variance-covariance matrix
  QR <- lmObject$qr   ## qr object of fitted model
  piv <- QR$pivot     ## pivoting index
  r <- QR$rank        ## model rank / numeric rank
  if (is.unsorted(piv)) {
    ## pivoting has been done
    B <- forwardsolve(t(QR$qr), t(Xp[, piv]), r)
  } else {
    ## no pivoting is done
    B <- forwardsolve(t(QR$qr), t(Xp), r)
  }
  ## residual variance
  sig2 <- c(crossprod(residuals(lmObject))) / df.residual(lmObject)
  if (diag) {
    ## return point-wise prediction variance
    VCOV <- colSums(B ^ 2) * sig2
  } else {
    ## return full variance-covariance matrix of predicted values
    VCOV <- crossprod(B) * sig2
  }
  list(fit = pred, var.fit = VCOV, df = lmObject$df.residual, residual.var = sig2)
}
agg_pred <- function (w, predObject, alpha = 0.95) {
  ## input checking
  if (length(w) != length(predObject$fit)) stop("'w' has wrong length!")
  if (!is.matrix(predObject$var.fit)) stop("'predObject' has no variance-covariance matrix!")
  ## mean of the aggregation
  agg_mean <- c(crossprod(predObject$fit, w))
  ## variance of the aggregation
  agg_variance <- c(crossprod(w, predObject$var.fit %*% w))
  ## adjusted variance-covariance matrix
  VCOV_adj <- with(predObject, var.fit + diag(residual.var, nrow(var.fit)))
  ## adjusted variance of the aggregation
  agg_variance_adj <- c(crossprod(w, VCOV_adj %*% w))
  ## t-distribution quantiles
  Qt <- c(-1, 1) * qt((1 - alpha) / 2, predObject$df, lower.tail = FALSE)
  ## names of CI and PI
  NAME <- c("lower", "upper")
  ## CI
  CI <- setNames(agg_mean + Qt * sqrt(agg_variance), NAME)
  ## PI
  PI <- setNames(agg_mean + Qt * sqrt(agg_variance_adj), NAME)
  ## return
  list(mean = agg_mean, var = agg_variance, CI = CI, PI = PI)
}
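For orientation, here is a self-contained sketch of what these two functions compute when applied to the log-log fit, written with `vcov()` instead of the QR factor (an equivalent but less efficient route, chosen here only for readability). The grid of new `x` values and all variable names are my own assumptions. The key point is that the aggregate is a sum of `log(y)` values, that is, the log of a product, not the sum of `y` itself.

```r
data(stackloss)
dat <- stackloss[c(2, 4)]
colnames(dat) <- c("x", "y")
dat.lm <- lm(log(y) ~ log(x), data = dat)
new <- data.frame(x = seq(17, 27, 1))   ## assumed grid; x must be > 0

## design matrix for the new data, as in lm_predict
Xp <- model.matrix(delete.response(terms(dat.lm)), new)
mu <- c(Xp %*% coef(dat.lm))            ## predicted log(y)
V <- Xp %*% vcov(dat.lm) %*% t(Xp)      ## full covariance of the predicted log-scale means
sig2 <- sigma(dat.lm)^2                 ## residual variance on the log scale

## aggregation on the LOG scale, as in agg_pred with unit weights
w <- rep(1, nrow(new))
agg_mean <- sum(w * mu)
agg_var <- c(t(w) %*% V %*% w)                              ## used for a CI
agg_var_adj <- c(t(w) %*% (V + diag(sig2, nrow(V))) %*% w)  ## used for a PI
Qt <- c(-1, 1) * qt(0.975, df.residual(dat.lm))
PI_log <- agg_mean + Qt * sqrt(agg_var_adj)  ## interval for sum(log(y)), not sum(y)
```

Because `exp(sum(log(y)))` is the product of the `y` values, exponentiating `PI_log` does not give an interval for the sum of the predictions.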
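However, in the log-log regression case these functions cannot be applied directly to obtain the correct variance of the aggregate. Perhaps I should transform the variances in the output of `lm_predict`, but I do not know how to proceed.
Thanks in advance for your help.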
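One possible way to frame the problem, offered as a sketch under stated assumptions rather than a definitive answer: if the errors are normal on the log scale and the log-scale predictive distribution is approximated as multivariate normal with mean `mu` and covariance `Sigma = V + sig2 * I` (this ignores the t-distribution correction used in `agg_pred`), then the back-transformed values are jointly lognormal. Their means and covariances then have closed forms, and an interval for the sum can be approximated by simulation, because a sum of lognormal variables is not itself lognormal. All names below (`mu`, `Sigma`, `nsim`, the grid in `new`) are illustrative.

```r
set.seed(1)
data(stackloss)
dat <- stackloss[c(2, 4)]
colnames(dat) <- c("x", "y")
dat.lm <- lm(log(y) ~ log(x), data = dat)
new <- data.frame(x = seq(17, 27, 1))   ## assumed grid; x must be > 0

## log-scale predictive mean and covariance (normal approximation)
Xp <- model.matrix(delete.response(terms(dat.lm)), new)
mu <- c(Xp %*% coef(dat.lm))
V <- Xp %*% vcov(dat.lm) %*% t(Xp)
Sigma <- V + diag(sigma(dat.lm)^2, nrow(V))

## closed-form moments of a multivariate lognormal vector
m <- exp(mu + diag(Sigma) / 2)          ## E[y_i]
C <- outer(m, m) * (exp(Sigma) - 1)     ## Cov(y_i, y_j)
sum_mean <- sum(m)                      ## mean of the sum
sum_var <- sum(C)                       ## variance of the sum

## simulated interval for the sum on the original scale
nsim <- 1e4
Z <- matrix(rnorm(nsim * length(mu)), nsim) %*% chol(Sigma)  ## rows ~ N(0, Sigma)
S <- rowSums(exp(sweep(Z, 2, mu, "+")))                      ## simulated sums of y
PI_sum <- quantile(S, c(0.025, 0.975))
```

With only a few residual degrees of freedom the normal approximation may be too narrow; drawing from a multivariate t distribution instead would be one way to account for that.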