Question

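I am bootstrapping 95% confidence intervals for a dataset in which we want to test whether the mean difference in a response variable (data.values) between two factor levels is significantly different from zero. The factor is either factorB or factorC, which I handle separately. They form a partially nested design, but I can correct the data within each of the three field sites (factorA) so that factorC is not a confounder when considering factorB. In my case I have 4 replicates for each of the two treatments (factorB), and I would like to bootstrap using each of the 16 possible pairwise combinations, with one value from each treatment (i.e. the first value within factorB == "A" minus each of the four values within factorB == "B", the second value within factorB == "A" minus each of the four, and so on).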
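As a small illustration of the 16 pairwise combinations described above, here is a minimal sketch using `outer()`. The vectors `a` and `b` are hypothetical values made up for this illustration; they are not taken from the actual data.

```r
# Hypothetical replicate values, for illustration only.
a <- c(1.2, -0.5, 0.3, 1.9)   # four replicates where factorB == "A"
b <- c(0.4, 1.1, -1.6, 0.7)   # four replicates where factorB == "B"

# outer() builds the 4 x 4 matrix whose [i, j] entry is a[i] - b[j],
# i.e. all 16 pairwise differences.
pairwise <- outer(a, b, "-")

dim(pairwise)                 # number of rows and columns
mean(pairwise)                # mean of all 16 pairwise differences
mean(a) - mean(b)             # difference of the two group means
```

With equal group sizes, the mean of all 16 pairwise differences is algebraically the same as the difference of the two group means, which is why the statistic functions below only need `mean()` of each group.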

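I can achieve this by subsetting DF$data.values with positional indices in [], and that approach seems to work as expected. However, I would like to use logical subsetting instead, because positional indexing gets much more complicated once I bootstrap the differences between factorC levels, and logical subsetting would give me a more general, reusable script. The problem is that when I define the statistic= function with logical subsetting, I get different results: the intervals are wider, and that difference in 95% CI width changes my conclusions. I also noticed that the logical subsetting approach uses fewer bootstrap replicates than I requested. Is there a reason boot() treats these two seemingly identical statistic= functions differently?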

Below is a reproducible example of the problem:

library(boot)
set.seed(123)

#creating data layout example
mydata <- data.frame(
  factorA = c(rep("A", times = 8), rep("B", times = 8), rep("C", times = 8)),
  factorB = rep(c(rep("A", times = 4), rep("B", times = 4)), times = 3),
  factorC = rep(c("A", "B"), times = 12),
  data.values = sample(-20:20, 24, replace = TRUE) / 10)

plots.A <- mydata[mydata$factorA=="A",]
#Subsetting by factorA since I will eventually need to do a separate bootstrap for each factor level.
#I also added factorC since I will eventually be using the differences between factorB and factorC for each level of factorA

#demonstrating that these subsetting methods are equivalent statements
plots.A$data.values[1:4]
plots.A$data.values[plots.A$factorB=="A"]
plots.A$data.values[5:8]
plots.A$data.values[plots.A$factorB=="B"]


#defining the same statistic= function for boot() with both subsetting methods
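#positional subsetting: rows 1-4 minus rows 5-8 of the resampled data frame
diff <- function(df, x) {
  DF <- df[x, ]
  return(mean(DF$data.values[1:4]) - mean(DF$data.values[5:8]))
}

#logical subsetting: groups are picked by the factorB label of each resampled row
diff2 <- function(df, x) {
  DF <- df[x, ]
  return(mean(DF$data.values[DF$factorB == "A"]) - mean(DF$data.values[DF$factorB == "B"]))
}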
#given my data layout, I would prefer to use the latter approach given the 
#complexity of subsetting factor C with the [#:#] or c(#, #,...) notation

#comparing both boot.ci() outputs

#implement and view boot function
boot1<-boot(plots.A, diff, R=10000)
boot2<-boot(plots.A, diff2, R=10000)
boot1
boot2
#get and view 95% confidence intervals from bootstrap
ci<-boot.ci(boot1,type="basic")
ci2 <- boot.ci(boot2, type="basic")
ci
ci2
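One plausible explanation, not confirmed anywhere in this post, is that an unstratified resample does not keep four rows per factorB level. Positions 1:4 of a resample can then hold rows from either group, and the label-based groups can be unbalanced or even empty, which makes `mean()` return NaN. The sketch below counts such replicates and then reruns both statistics with the `strata` argument of `boot()`, which resamples within each factorB level. The names `diff_pos`, `diff_lab`, `n_bad` and `same` are introduced here for illustration only, and the data frame reproduces the layout of `plots.A` rather than its exact values.

```r
library(boot)
set.seed(123)

# A data frame with the same layout as plots.A (values need not match).
plots.A <- data.frame(
  factorB = rep(c("A", "B"), each = 4),
  data.values = sample(-20:20, 8, replace = TRUE) / 10
)

# The two statistics from the example, renamed to avoid masking base::diff.
diff_pos <- function(df, x) {
  DF <- df[x, ]
  mean(DF$data.values[1:4]) - mean(DF$data.values[5:8])
}
diff_lab <- function(df, x) {
  DF <- df[x, ]
  mean(DF$data.values[DF$factorB == "A"]) -
    mean(DF$data.values[DF$factorB == "B"])
}

# 1. Unstratified bootstrap: group sizes vary between resamples.
b_lab <- boot(plots.A, diff_lab, R = 10000)
# Replicates in which one group was empty, so the statistic is NaN.
n_bad <- sum(!is.finite(b_lab$t))
n_bad

# 2. Stratified bootstrap: resample within each factorB level.
#    strata must be a factor or integer vector, not a character vector.
strata <- factor(plots.A$factorB)
set.seed(1)
s_pos <- boot(plots.A, diff_pos, R = 10000, strata = strata)
set.seed(1)
s_lab <- boot(plots.A, diff_lab, R = 10000, strata = strata)

# With the same seed, both statistics should now give the same replicates.
same <- isTRUE(all.equal(s_pos$t, s_lab$t))
same

boot.ci(s_lab, type = "basic")
```

If `n_bad` is greater than zero, the non-finite replicates would account for the smaller replicate count reported for the logical-subsetting version. Whether stratified resampling is the right design for the real analysis depends on how the field data were collected, so this is a diagnostic sketch rather than a recommended final method.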

Thanks! (Also, I have followed SO for a long time, but this is my first post, so any comments on formatting or needed edits are appreciated.)

How does boot() handle different methods of subsetting a vector?

0 answers: