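I want to use boot.ci() to compute BCa confidence intervals for a multi-stage bootstrap. Here is an example, taken from "Non-parametric bootstrapping on the highest level of clustered data using boot() function from {boot} in R", which uses the boot command.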
library(boot)

# creating example df
rho <- 0.4
dat <- expand.grid(
  trial = factor(1:5),
  subject = factor(1:3)
)
sig <- rho * tcrossprod(model.matrix(~ 0 + subject, dat))
diag(sig) <- 1
set.seed(17); dat$value <- chol(sig) %*% rnorm(15, 0, 1)

# function for resampling: draws whole clusters (subjects) with replacement;
# note that the `indices` supplied by boot() are not used
resamp.mean <- function(dat,
                        indices,
                        cluster = c('subject', 'trial'),
                        replace = TRUE){
  cls <- sample(unique(dat[[cluster[1]]]), replace = replace)
  sub <- lapply(cls, function(b) subset(dat, dat[[cluster[1]]] == b))
  sub <- do.call(rbind, sub)
  mean(sub$value)
}
dat.boot <- boot(dat, resamp.mean, 4) # produces an estimated statistic
boot.ci(dat.boot) # produces errors
How can I use boot.ci on the output of boot?
Answer 0 (score: 0)
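You are using too few bootstrap resamples. When you call boot.ci for a BCa interval, it needs influence measures; if none are supplied, they are obtained from empinf, which can fail when there are too few replicates. For a similar explanation, see here.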
Try
dat.boot <- boot(dat, resamp.mean, 1000)
boot.ci(dat.boot, type = "bca")
which gives:
> boot.ci(dat.boot, type = "bca")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates
CALL :
boot.ci(boot.out = dat.boot, type = "bca")
Intervals :
Level BCa
95% (-0.2894, 1.2979 )
Calculations and Intervals on Original Scale
Some BCa intervals may be unstable
As an alternative, you can supply L (the influence measures) yourself.
# proof of concept, use appropriate value for L!
> dat.boot <- boot(dat, resamp.mean, 4)
> boot.ci(dat.boot, type = "bca", L = 0.2)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 4 bootstrap replicates
CALL :
boot.ci(boot.out = dat.boot, type = "bca", L = 0.2)
Intervals :
Level BCa
95% ( 0.1322, 1.2979 )
Calculations and Intervals on Original Scale
Warning : BCa Intervals used Extreme Quantiles
Some BCa intervals may be unstable
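A different route, offered only as a sketch and not as part of the original question's code: let boot() resample the clusters through its own indices, so that empinf can relate each replicate to the resampling frequencies and boot.ci can estimate the influence values itself. The names cluster.mean, subjects, full and clus.boot are hypothetical, and the sketch assumes that the subject is the only level being resampled and that value is stored as a plain numeric vector.

```r
library(boot)

# Rebuild the example data from the question.
rho <- 0.4
dat <- expand.grid(trial = factor(1:5), subject = factor(1:3))
sig <- rho * tcrossprod(model.matrix(~ 0 + subject, dat))
diag(sig) <- 1
set.seed(17)
# Assumption: keep `value` as a numeric vector rather than a one-column matrix.
dat$value <- as.numeric(chol(sig) %*% rnorm(15, 0, 1))

# The "data" handed to boot() is the vector of cluster labels,
# so the indices generated by boot() select whole subjects.
subjects <- levels(dat$subject)

# Hypothetical statistic: mean of `value` over all rows of the resampled subjects.
# `ids` is the vector of cluster labels, `i` the indices drawn by boot(),
# and `full` the complete data frame, passed through boot()'s `...` argument.
cluster.mean <- function(ids, i, full) {
  rows <- unlist(lapply(ids[i], function(s) which(full$subject == s)))
  mean(full$value[rows])
}

clus.boot <- boot(subjects, cluster.mean, R = 1000, full = dat)
ci <- boot.ci(clus.boot, type = "bca")
print(ci)
```

Because the statistic now uses the indices it is given, t0 is the mean of the original data and no L has to be made up by hand. With only three subjects the bootstrap distribution is very coarse, so the resulting interval should still be read with caution.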