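I'm running car version 2.1.4 and trying to use the Anova function to get Wald-based p-values, for a power analysis of a logistic regression specified with the successes/failures syntax. When I run the simple factorial below, the function fails with an error about 0 residual degrees of freedom, even though the sample size is very large. What am I doing wrong, or what am I misunderstanding about this error? Or is the problem in the glm() call, since it also reports zero residual df?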
# Column 1 = successes, column 2 = failures, one row per design cell
X <- matrix(c(100, 66566, 73, 66593, 1201, 398799, 165, 66501),
            nrow = 4, ncol = 2, byrow = TRUE)
x_df <- data.frame(premium = c(300, 300, 500, 500),
                   restrict = c(500, 2500, 500, 2500))
# Interaction term built by hand
x_df$int <- x_df$premium * x_df$restrict
mod <- glm(X ~ premium + restrict + int,
           data = x_df, family = binomial)
summary(mod)
car::Anova(mod, type = "III", test.statistic = "Wald")
ADD#1:
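It seems the successes/failures syntax is not working as I expected. When I manually expand the data to ~600,000 rows, the fit is the same, but the residual df is correct: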
library(car)  # provides Anova()

# Grouped (successes/failures) fit
X <- matrix(c(100, 66566, 73, 66593, 1201, 398799, 165, 66501),
            nrow = 4, ncol = 2, byrow = TRUE)
x_df <- data.frame(premium = c(300, 300, 500, 500),
                   restrict = c(500, 2500, 500, 2500))
x_df$int <- x_df$premium * x_df$restrict
mod <- glm(X ~ premium + restrict + premium * restrict,
           data = x_df, family = binomial)
summary(mod)
Anova(mod, type = "III", test.statistic = "Wald")

# Same data expanded by hand to one row per trial (1 = success, 0 = failure)
y <- c(rep(1, 100), rep(0, 66566), rep(1, 73), rep(0, 66593),
       rep(1, 1201), rep(0, 398799), rep(1, 165), rep(0, 66501))
premium <- c(rep(300, 66666 * 2), rep(500, 1201 + 398799 + 165 + 66501))
restrict <- c(rep(500, 66666), rep(2500, 66666),
              rep(500, 1201 + 398799), rep(2500, 165 + 66501))
x <- data.frame(y = y, premium = premium, restrict = restrict)
mod2 <- glm(y ~ premium + restrict + premium * restrict,
            data = x, family = binomial)
summary(mod2)
Anova(mod2, type = "III", test.statistic = "Wald")
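The comparison in ADD#1 can also be sketched in base R without typing the counts out by hand. This is only an illustration: the names `grouped`, `long`, `expanded`, `max_coef_diff`, `df_grouped` and `df_expanded` are made up here and do not appear in the original post, and car is deliberately not used.

```r
# Counts from the question: column 1 = successes, column 2 = failures.
X <- matrix(c(100, 66566, 73, 66593, 1201, 398799, 165, 66501),
            nrow = 4, ncol = 2, byrow = TRUE)
x_df <- data.frame(premium = c(300, 300, 500, 500),
                   restrict = c(500, 2500, 500, 2500))

# Grouped fit: 4 rows of data, 4 coefficients.
grouped <- glm(X ~ premium * restrict, data = x_df, family = binomial)

# Expand to one row per trial (illustrative helper code, not from the post).
n_row <- rowSums(X)  # trials per design cell
long <- data.frame(
  y = unlist(lapply(1:4, function(i) rep(c(1, 0), times = X[i, ]))),
  premium = rep(x_df$premium, times = n_row),
  restrict = rep(x_df$restrict, times = n_row)
)
expanded <- glm(y ~ premium * restrict, data = long, family = binomial)

# The estimates should agree up to convergence tolerance; only the
# residual df differ, because the number of rows differs.
max_coef_diff <- max(abs(coef(grouped) - coef(expanded)))
df_grouped <- df.residual(grouped)
df_expanded <- df.residual(expanded)
print(c(df_grouped = df_grouped, df_expanded = df_expanded))
```

The two fits describe the same data; what changes is how many rows glm() counts as observations when it computes the residual df.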
Answer 0 (score: 0)
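The residual degrees of freedom are (number of observations) - (number of parameters). With the successes/failures form, each row counts as one observation, so you have four observations and four parameters, which leaves zero. I'm not sure what else there is to say...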
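As a rough illustration of that arithmetic, the sketch below refits the grouped model in base R and computes the residual df by hand. It also computes a per-coefficient Wald chi-square from `coef()` and `vcov()`. For the binomial family the dispersion is fixed at 1, so these statistics do not use the residual df. This is an assumption-laden sketch, not car's implementation, and the names `n_obs`, `n_par`, `res_df` and `wald_table` are invented for the example.

```r
# Grouped data from the question: successes and failures per design cell.
X <- matrix(c(100, 66566, 73, 66593, 1201, 398799, 165, 66501),
            nrow = 4, ncol = 2, byrow = TRUE)
x_df <- data.frame(premium = c(300, 300, 500, 500),
                   restrict = c(500, 2500, 500, 2500))
mod <- glm(X ~ premium * restrict, data = x_df, family = binomial)

# Residual df = observations - parameters.
n_obs <- nrow(x_df)         # 4 rows, each one binomial observation
n_par <- length(coef(mod))  # intercept, premium, restrict, interaction
res_df <- n_obs - n_par     # should match df.residual(mod)

# Wald chi-square per coefficient (1 df each): (estimate / SE)^2.
est <- coef(mod)
se <- sqrt(diag(vcov(mod)))
wald_chisq <- (est / se)^2
wald_p <- pchisq(wald_chisq, df = 1, lower.tail = FALSE)
wald_table <- data.frame(estimate = est, se = se,
                         chisq = wald_chisq, p = wald_p)
print(wald_table)
```

These p-values are the same as the z-test p-values that summary(mod) prints, since a squared standard normal statistic follows a chi-square distribution with 1 df.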