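I am trying to estimate the breakpoints of a variable V using three covariates (X, Y, Z) and two breakpoints.
The response variable is V = aX + bY + cZ + d.
I simulate data in which (a, b, c, d) takes three sets of values: (0.6, 0.2, 0.8, 0.15), (1.6, 1.2, 1.8, 1.15) and (3, 5, 4, 2.5).
I use the segmented package to estimate the coefficients, but I get the following error: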
Error in segmented.lm(linearFit, seg.Z = ~X + Y + Z, psi = list(X = c(NA), :
Bootstrap restart only with a fixed number of breakpoints
Here is my code, including the data:
#trapezoidal data
ref=c(rep(1,100),seq(1,10,0.05),rep(10,150),seq(10,0,-0.05),rep(0,200))
#covariates
xx=cumsum(ref)
yy=diff(xx)
zz=diff(yy)
#equalizing lengths of above vectors
vecL=length(zz)
xx=xx[1:vecL]
yy=yy[1:vecL]
zz=zz[1:vecL]
#adding noise to covariates
set.seed(10)
X=xx + max(xx)/100*rnorm(vecL)
Y=yy + max(yy)/100*rnorm(vecL)
Z=zz + max(zz)/100*rnorm(vecL)
#three segment response variable, total 830 points
V=numeric(vecL) #allocate V first; assigning to V[1:200] fails if V does not exist yet
V[1:200] = 0.6 *X[1:200]+ 0.2 *Y[1:200]+ 0.8 *Z[1:200]+ 0.15 + 0.01*rnorm(200)
V[201:400] = 1.6 *X[201:400]+ 1.2 *Y[201:400]+ 1.8 *Z[201:400]+ 1.15 + 0.01*rnorm(200)
V[401:830] = 3.0 *X[401:830]+ 5.0 *Y[401:830]+ 4.0 *Z[401:830]+ 2.50 + 0.01*rnorm(430)
##linear model
linearFit=lm(formula=V~X+Y+Z)
summary(linearFit)
##segmented
library(segmented)
segFit=segmented(linearFit,seg.Z=~X+Y+Z,psi=list(X=c(NA),Y=c(NA),Z=c(NA)),control=seg.control(display=TRUE, K=4, stop.if.error=FALSE))
This is the output:
segFit=segmented(linearFit,seg.Z=~X+Y+Z,psi=list(X=c(NA),Y=c(NA),Z=c(NA)),control=seg.control(display=TRUE, K=4, stop.if.error=FALSE))
Error in segmented.lm(linearFit, seg.Z = ~X + Y + Z, psi = list(X = c(NA), :
Bootstrap restart only with a fixed number of breakpoints
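Am I setting psi and control correctly? Any help is appreciated.
Answer 0 (score: 0)
Automatic breakpoint detection appears to be quite experimental, and the documentation says as much. Supplying a limited number of starting values would be better. In any case, I was able to get the fitting function to at least start running like this: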
segFit=segmented(linearFit,seg.Z=~X+Y+Z,psi=list(X=c(NA),Y=c(NA),Z=c(NA)),
control=seg.control(display=TRUE, K=4, stop.if.error=FALSE, n.boot=0, it.max=50))
#0 287035116.259 (No breakpoint(s))
#1 52847700.113 12
#2 66421579.610 7
#3 60143023.830 7
#4 55936266.042 7
#5 45478319.984 5
#6 37237514.620 5
#7 34058342.767 5
#8 33889551.970 3
#9 33679837.419 3
#10 33680392.183 3
#Error in eval(expr, envir, enclos) : object 'U1.Y' not found
This throws an error. My interpretation is that no breakpoint was found for Y, so I removed it from the breakpoint formula:
segFit=segmented(linearFit,seg.Z=~X+Z,psi=list(X=c(NA),Z=c(NA)),
control=seg.control(display=TRUE, K=4, stop.if.error=FALSE, n.boot=0, it.max=50))
#0 287035116.259 (No breakpoint(s))
#1 57518175.693 8
#2 75024714.551 4
#3 53678468.904 4
#4 42978477.989 4
#5 36762393.424 4
#6 34564133.079 4
#7 33672729.061 4
#8 33672705.918 4
#Error in eval(expr, envir, enclos) : object 'U1.Z' not found
It still fails, this time on Z. Let's remove Z as well:
segFit=segmented(linearFit,seg.Z=~X,psi=list(X=c(NA)),
control=seg.control(display=TRUE, K=4, stop.if.error=FALSE, n.boot=0, it.max=50))
#0 287035116.259 (No breakpoint(s))
#1 59188023.560 4
#2 84927431.755 3
#3 58905175.574 3
#4 46487759.098 3
#5 39114874.784 3
#6 34916433.946 3
#7 33986478.337 3
#8 33680464.097 3
#9 33680464.097 3
Success! (I am not sure that segmented handles breaks in several variables at a single point very well.)
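Returning to the earlier suggestion of supplying a limited number of starting values: the following is only a minimal sketch on toy data, not the data from the question. The names x, y, dat, base_fit and seg_fit, the true breakpoints (40 and 70) and the starting guesses (30 and 80) are all assumptions made for illustration. Passing two numeric starting values in psi fixes the number of breakpoints at two, so the automatic selection is not involved.

```r
# Minimal sketch: segmented regression with explicit starting values for psi.
# Toy data (hypothetical): one covariate x with slope changes at x = 40 and x = 70.
library(segmented)

set.seed(1)
x <- seq(1, 100, by = 0.5)
y <- 2 + 0.5 * x +
  1.5 * pmax(x - 40, 0) -   # slope increases by 1.5 after x = 40
  3.0 * pmax(x - 70, 0) +   # slope decreases by 3.0 after x = 70
  rnorm(length(x), sd = 1)
dat <- data.frame(x = x, y = y)

# Ordinary linear fit that segmented() starts from
base_fit <- lm(y ~ x, data = dat)

# Two starting guesses for the breakpoints in x (arbitrary, but inside the range of x)
seg_fit <- segmented(base_fit, seg.Z = ~x, psi = list(x = c(30, 80)))

seg_fit$psi      # starting values, estimated breakpoints and standard errors
slope(seg_fit)   # estimated slope within each segment
```

For the data in the question, the same pattern would mean replacing NA in psi with rough guesses read off a plot of V against X; whether that converges there is not something this sketch shows.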