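I am trying to set up a difference-in-differences (DiD) model in R. I have a baseline phase and a treatment group, and I want the model to account for both the baseline and the participants' age. To do that, I created two dummy variables: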
young <- Shower_data$Age %in% c("20-29", "30-39")
old <- Shower_data$Age %in% c("40-49", "50-64", "65+")
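# First attempt, using the logical vector `young` defined above
# (these values are overwritten by the assignments that follow)
Shower_data$young_pos <- ifelse(young > 0, 1, 0)
Shower_data$young_neg <- ifelse(young <= 0, 1, 0)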
Shower_data$young_pos <- 1
Shower_data$young_pos [ old ] <- 0
Shower_data$young_neg <- 0
Shower_data$young_neg [ old ] <- 1
#Create a model that considers the baseline and age
model4 <- lm(Volume ~ (Shower + dummy_phase * dummy_exp_group) * (young_pos + young_neg), data = Shower_data)
summary(model4)
Without age, everything works as expected. Once I add age, however, I only get results for the young_pos variable and none for young_neg, as you can see here:
#Coefficients: (5 not defined because of singularities)
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) 45.07006 1.65499 27.233 < 2e-16 ***
#Shower -0.04725 0.00819 -5.769 8.09e-09 ***
#dummy_phase -5.35647 1.72401 -3.107 0.00189 **
#dummy_exp_grouptreatment -9.33660 1.95433 -4.777 1.79e-06 ***
#young_pos 8.11264 2.78459 2.913 0.00358 **
#young_neg NA NA NA NA
#dummy_phase:dummy_exp_grouptreatment 6.23700 2.06968 3.014 0.00259 **
#Shower:young_pos 0.07690 0.01361 5.652 1.61e-08 ***
#Shower:young_neg NA NA NA NA
#dummy_phase:young_pos -1.38223 2.87629 -0.481 0.63084
#dummy_phase:young_neg NA NA NA NA
#dummy_exp_grouptreatment:young_pos 1.94658 3.19773 0.609 0.54271
#dummy_exp_grouptreatment:young_neg NA NA NA NA
#dummy_phase:dummy_exp_grouptreatment:young_pos 2.56634 3.39298 0.756 0.44944
#dummy_phase:dummy_exp_grouptreatment:young_neg NA NA NA NA
As you can see, every term involving young_neg is reported as NA. Thanks.
Answer 0: (score: 1)
Look at this part of the code:
Shower_data$young_pos <- 1
Shower_data$young_pos [ old ] <- 0
Shower_data$young_neg <- 0
Shower_data$young_neg [ old ] <- 1
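The vector young_pos contains 0s and 1s. The vector young_neg also contains 0s and 1s. However, one vector is the exact complement of the other: young_neg equals 1 - young_pos in every row, so the two columns always sum to the intercept column. Both vectors therefore encode the same information, and the model can only estimate the effect of one of them. This is why lm() reports "5 not defined because of singularities" and returns NA for every term involving young_neg.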
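The redundancy can be reproduced on a small example. The sketch below uses a made-up data frame called toy (hypothetical; only the column names are borrowed from the question, and the values are simulated), checks that the two dummies always sum to 1, and compares a fit with both dummies against a fit with just one. It is meant as an illustration under those assumptions, not as the asker's actual data or final model.

```r
# Hypothetical toy data: column names mirror the question, values are simulated.
set.seed(1)
n <- 200
toy <- data.frame(
  Age    = sample(c("20-29", "30-39", "40-49", "50-64", "65+"), n, replace = TRUE),
  Shower = sample(1:100, n, replace = TRUE)
)
toy$Volume <- 40 + rnorm(n, sd = 5)

# Build the two dummies the same way as in the question.
old <- toy$Age %in% c("40-49", "50-64", "65+")
toy$young_pos <- as.integer(!old)
toy$young_neg <- as.integer(old)

# The two dummies sum to 1 in every row, i.e. to the intercept column.
all_one <- all(toy$young_pos + toy$young_neg == 1)

# With both dummies, lm() cannot separate them and sets the redundant terms to NA.
fit_both <- lm(Volume ~ Shower * (young_pos + young_neg), data = toy)
na_terms <- names(coef(fit_both))[is.na(coef(fit_both))]

# With a single dummy, every coefficient is estimable; the old group is the
# reference level and young_pos measures the difference from it.
fit_one <- lm(Volume ~ Shower * young_pos, data = toy)

print(all_one)
print(na_terms)
print(coef(fit_one))
```

Applied to the original model, the same idea would mean dropping young_neg and writing the formula with `* young_pos` only. The coefficients without young_pos then describe the older group, and the young_pos interactions describe how the younger group differs.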