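I am building a bootstrap function that fits a logistic regression to each bootstrap data set and then does some additional calculations. For small samples it sometimes produces warnings such as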
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
I would like to flag such warnings for each bootstrap iteration. Here is my code:
library(parallel)
n=length(y);
dataN=as.matrix((cbind(y,data)));
cl <- makeCluster(detectCores())
clusterExport(cl,c("dataN"), envir=environment())
clusterExport(cl,c("n"), envir=environment())
clusterEvalQ(cl,library(glmnet))
q=2000;
repl1=parLapply(cl=cl,1:q,function(i,dataA=dataN,smpl=n,...){
dataB <- dataA[sample(nrow(dataA),size=smpl,replace=TRUE),]
p=dim(dataB)[2];p
forname=c(colnames(dataB)[2:p]);
formulaZS=as.formula(paste("y~",paste(forname,collapse="+"),sep=""));
assign("last.warning", NULL, envir = baseenv())
m=glm(formulaZS,family=binomial,data=data.frame(dataB))
coefs=summary(m)$coef[,1];coefs
a=length(attr(warnings(),"names"))
indicator=ifelse(a==0,0,1);
return(list(coefs=coefs,indicator=indicator))
})
stopCluster(cl)
beta=t(matrix(unlist(lapply(repl1, "[[", "coefs")),nrow=p+1));
beta.est=apply(beta,2,mean);beta.est
indicator=t(matrix(unlist(lapply(repl1, "[[", "indicator")),nrow=1));
sum(indicator)
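Before each regression fit, I reset the warnings with assign("last.warning", NULL, envir = baseenv()) and then check whether a warning occurred by computing the length of the names of warnings(). This works if I run the steps manually, without parLapply. However, when I run the function above, it returns an indicator vector containing only zeros, even when there are warnings.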
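For comparison, here is a minimal sketch of an approach that does not depend on last.warning at all. It rests on the assumption that, with the default options(warn = 0), warnings raised inside a function are held until the top-level call returns, so warnings() cannot see them from within the worker function. The helper name fit_with_flag is hypothetical and is not part of the code above; the toy data set is made up purely to provoke a warning.

```r
# Hypothetical helper (not part of the original code): fit a logistic
# regression and record whether any warning was signalled during the fit.
fit_with_flag <- function(formula, data) {
  warned <- FALSE
  fit <- withCallingHandlers(
    glm(formula, family = binomial, data = data),
    warning = function(w) {
      warned <<- TRUE                 # remember that a warning occurred
      invokeRestart("muffleWarning")  # continue fitting without printing it
    }
  )
  list(coefs = coef(fit), indicator = as.integer(warned))
}

# Perfectly separated toy data, which makes glm.fit emit warnings.
toy <- data.frame(y = c(0, 0, 0, 1, 1, 1), x = c(1, 2, 3, 4, 5, 6))
res <- fit_with_flag(y ~ x, toy)
res$indicator
```

Inside parLapply, a helper like this would either have to be defined in the body of the worker function or sent to the workers with clusterExport, in the same way as dataN and n.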
Data-generation code:
fundata=function(n,p,corr){
R = matrix(rep(corr,p*p),nrow=p)+(1-corr)*diag(p);
R <- round(((R * lower.tri(R)) + t(R * lower.tri(R))),2)
diag(R) <- 1
U = t(chol(R))
nvars = dim(U)[1]
random.normal = matrix(rnorm(nvars*n,mean=0,sd=1), nrow=nvars, ncol=n);
X = U %*% random.normal
newX = t(X)
data = as.data.frame(newX)
names(data)<-sprintf("V%d",1:p)
return(data=data)
}
n=50;
p=5;
corr=0.5;
seed=123;
set.seed(seed)  # apply the seed so the simulated data are reproducible
#Generate independent variables
data=fundata(n,p,corr);
#Generate dependent variable
p=dim(data)[2];p
true=c(-1,0,0,0.2,0.675,-1.5)
z=as.matrix(cbind(rep(1,n),data[,1:p]))%*%true;
pr = 1/(1+exp(-z));
y = rbinom(n=n, size=1, prob=t(pr))
dataN=as.matrix((cbind(y,data)))
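The generator fundata relies on the identity that if U is the lower-triangular Cholesky factor of R, then U %*% t(U) equals R, so multiplying independent standard normals by U gives variables with correlation matrix R. A small check of that identity, assuming the same p = 5 and corr = 0.5 as above (the name max_err is introduced only for this illustration):

```r
p <- 5
corr <- 0.5

# Equicorrelation matrix: corr off the diagonal, 1 on the diagonal.
R <- matrix(corr, nrow = p, ncol = p)
diag(R) <- 1

# chol() returns the upper-triangular factor, so transpose to get the lower one.
U <- t(chol(R))

# Largest absolute deviation between U U' and R; should be at rounding level.
max_err <- max(abs(U %*% t(U) - R))
max_err < 1e-12
```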