I first fitted a Poisson GLM in R as follows:
> Y<-c(13,21,12,11,16,9,7,5,8,8)
> X<-c(74,81,80,79,89,96,69,88,53,72)
> age<-c(50.45194,54.89382,46.52569,44.84934,53.25541,60.16029,50.33870,
+ 51.44643,38.20279,59.76469)
> dat=data.frame(Y=Y,off.set.term=log(X),age=age)
> fit.1=glm(Y~age+offset(off.set.term),data=dat,family=poisson)
Next, I tried to predict the response (on the log scale) for a new data set using the predict function. Note that I set the offset term to zero.
> newdat=data.frame(age=c(52.09374,50.89329,50.61472,39.13358,44.79453),off.set.term=rep(0,5))
> predict(fit.1,newdata =newdat,type="link")
        1         2         3         4         5
-1.964381 -1.956234 -1.954343 -1.876416 -1.914839
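As a sanity check on what these numbers mean, the linear predictor can be rebuilt by hand from the fitted coefficients. This is only a minimal sketch using base R; the names b, eta.manual and eta.predict are introduced here for illustration and are not part of the original post. With the offset set to zero, the prediction is the log of the expected count per unit of X.

```r
Y <- c(13, 21, 12, 11, 16, 9, 7, 5, 8, 8)
X <- c(74, 81, 80, 79, 89, 96, 69, 88, 53, 72)
age <- c(50.45194, 54.89382, 46.52569, 44.84934, 53.25541, 60.16029, 50.33870,
         51.44643, 38.20279, 59.76469)
dat <- data.frame(Y = Y, off.set.term = log(X), age = age)
fit.1 <- glm(Y ~ age + offset(off.set.term), data = dat, family = poisson)

newdat <- data.frame(age = c(52.09374, 50.89329, 50.61472, 39.13358, 44.79453),
                     off.set.term = rep(0, 5))

# Linear predictor by hand: intercept + slope * age + offset
b <- coef(fit.1)
eta.manual <- unname(b["(Intercept)"] + b["age"] * newdat$age + newdat$off.set.term)

# The same quantity as returned by predict() on the link (log) scale
eta.predict <- unname(predict(fit.1, newdata = newdat, type = "link"))

# The two should agree if predict() simply adds the offset found in newdata
print(all.equal(eta.manual, eta.predict))
```

Any predict method that handles the offset correctly should reproduce this sum for the unsegmented part of the model, which makes it a convenient baseline when comparing against other fits.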
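Next, I tried the segmented package (version 0.3-0.0) in R and fitted a segmented GLM as follows. (The latest version of the segmented package, 0.3-1.0, does not seem to support an offset term when using the predict function.)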
> library(segmented)
> fit.2=segmented(fit.1,seg.Z=~age,psi=list(age=mean(age)),
+ offs=off.set.term,data=newdat)
Then I used the predict function on fit.2 to obtain the predicted values:
> predict(fit.2,newdata =newdat,type="link")
        1         2         3         4         5
-26.62968 -26.08611 -25.95997 -20.76125 -23.32456
These predicted values differ markedly from the ones I obtained with fit.1.
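The problem seems to lie in the offset term, because when I fit the models without an offset term, the results are reasonable and close to each other, as shown here: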
> fit.3=glm(Y~age,data=dat,family=poisson)
> newdat.2=data.frame(age=c(52.09374,50.89329,50.61472,39.13358,44.79453))
> predict(fit.3,newdata =newdat.2,type="link")
       1        2        3        4        5
2.406016 2.395531 2.393098 2.292816 2.342261
> fit.4=segmented(fit.3,seg.Z=~age,psi=list(age=mean(age)),data=newdat.2)
> predict(fit.4,newdata =newdat.2,type="link")
       1        2        3        4        5
2.577669 2.524503 2.512165 2.003679 2.254396
Answer 0 (score: 1)
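Since I got an answer from the maintainer of the segmented package, I decided to share it here. First, update the package to version 0.3-1.0 via
install.packages("segmented",type="source")
After updating, running the same commands gives: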
> Y<-c(13,21,12,11,16,9,7,5,8,8)
> X<-c(74,81,80,79,89,96,69,88,53,72)
> age<-c(50.45194,54.89382,46.52569,44.84934,53.25541,60.16029,50.33870,
+ 51.44643,38.20279,59.76469)
> dat=data.frame(Y=Y,off.set.term=log(X),age=age)
> fit.1=glm(Y~age+offset(off.set.term),data=dat,family=poisson)
>
> newdat=data.frame(age=c(52.09374,50.89329,50.61472,39.13358,44.79453),off.set.term=rep(0,5))
> predict(fit.1,newdata =newdat,type="link")
        1         2         3         4         5
-1.964381 -1.956234 -1.954343 -1.876416 -1.914839
>
> library(segmented)
> fit.2=segmented(fit.1,seg.Z=~age,psi=list(age=mean(age)),offs=off.set.term,data=newdat)
> predict(fit.2,newdata =newdat,type="link")
Error in offset(off.set.term) : object 'off.set.term' not found
So the offset term cannot be found. The trick (for now) is to attach newdat first and then predict, as follows:
> attach(newdat)
The following object is masked _by_ .GlobalEnv:
age
> predict(fit.2,newdata =newdat,type="link")
        1         2         3         4         5
-1.825831 -1.853842 -1.860342 -2.128237 -1.996147
Now the results do make sense. Cheers!
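One caveat with the attach() workaround: the data frame stays on the search path until it is detached, and, as the masking message shows, the attached age is hidden behind the age vector in the global environment, so only off.set.term is actually picked up from the attached copy. The sketch below uses base R only to illustrate attaching and then cleaning up. The variable names found and still.attached are hypothetical, and the predict call on fit.2 is left as a comment because it needs the segmented fit from above.

```r
newdat <- data.frame(age = c(52.09374, 50.89329, 50.61472, 39.13358, 44.79453),
                     off.set.term = rep(0, 5))

attach(newdat)                        # columns become visible on the search path
found <- exists("off.set.term")       # the offset column can now be looked up by name

# The prediction from the answer would go here, while newdat is attached:
# preds <- predict(fit.2, newdata = newdat, type = "link")

detach(newdat)                        # remove newdat from the search path again
still.attached <- "newdat" %in% search()

print(found)
print(still.attached)
```

Detaching right after the call keeps later code from silently picking up off.set.term or other columns from a stale attached copy.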