R - DiD (difference-in-differences) model

Asked: 2017-01-14 10:32:59

Tags: r statistics dummy-variable

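I am trying to set up a DiD model in R. I have a baseline phase and a treatment group. I want the model to account for both the baseline and the influence of age, so I created two dummy variables.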

young <- Shower_data$Age %in% c("20-29", "30-39")
old <- Shower_data$Age %in% c("40-49", "50-64", "65+")

# `young` is a standalone logical vector, not a column of Shower_data
Shower_data$young_pos <- ifelse(young > 0, 1, 0)
Shower_data$young_neg <- ifelse(young <= 0, 1, 0)

Shower_data$young_pos <- 1
Shower_data$young_pos [ old ] <- 0
Shower_data$young_neg <- 0
Shower_data$young_neg [ old ] <- 1

#Create a model that considers the baseline and age
model4 <- lm(Volume ~ (Shower + dummy_phase * dummy_exp_group) * (young_pos + young_neg), data = Shower_data)
summary(model4)

Without age, everything works as expected. But when I add age, I only get results for the young_pos variable and none for young_neg, as you can see here:

#Coefficients: (5 not defined because of singularities)
#                                               Estimate Std. Error t value Pr(>|t|)    
#(Intercept)                                    45.07006    1.65499  27.233  < 2e-16 ***
#Shower                                         -0.04725    0.00819  -5.769 8.09e-09 ***
#dummy_phase                                    -5.35647    1.72401  -3.107  0.00189 ** 
#dummy_exp_grouptreatment                       -9.33660    1.95433  -4.777 1.79e-06 ***
#young_pos                                       8.11264    2.78459   2.913  0.00358 ** 
#young_neg                                            NA         NA      NA       NA    
#dummy_phase:dummy_exp_grouptreatment            6.23700    2.06968   3.014  0.00259 ** 
#Shower:young_pos                                0.07690    0.01361   5.652 1.61e-08 ***
#Shower:young_neg                                     NA         NA      NA       NA    
#dummy_phase:young_pos                          -1.38223    2.87629  -0.481  0.63084    
#dummy_phase:young_neg                                NA         NA      NA       NA    
#dummy_exp_grouptreatment:young_pos              1.94658    3.19773   0.609  0.54271    
#dummy_exp_grouptreatment:young_neg                   NA         NA      NA       NA    
#dummy_phase:dummy_exp_grouptreatment:young_pos  2.56634    3.39298   0.756  0.44944    
#dummy_phase:dummy_exp_grouptreatment:young_neg       NA         NA      NA       NA  

As you can see, every term involving young_neg shows only NA. Thanks.

1 answer:

Answer 0 (score: 1)

Look at this part of the code:

Shower_data$young_pos <- 1
Shower_data$young_pos [ old ] <- 0
Shower_data$young_neg <- 0
Shower_data$young_neg [ old ] <- 1

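The vector young_pos contains 0s and 1s. The vector young_neg also contains 0s and 1s. However, one vector is the exact opposite of the other: young_neg equals 1 - young_pos. Because the model also includes an intercept, the two columns are perfectly collinear. Both vectors therefore encode the same information, and the model can only estimate the effect of one of them. That is why lm() reports NA for every term involving young_neg ("not defined because of singularities").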
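
The collinearity can be reproduced on a small simulated data set. This is a minimal sketch, not the asker's data: the data frame `toy`, the columns `x` and `y`, and all numeric values are made up for illustration. Only `young_pos` and `young_neg` follow the naming in the question.

```r
# Hypothetical simulated data, for illustration only.
set.seed(1)
n <- 200
toy <- data.frame(
  young_pos = rbinom(n, 1, 0.5),  # 1 = young, 0 = old
  x         = rnorm(n)            # stand-in for a continuous covariate
)
toy$young_neg <- 1 - toy$young_pos  # exact complement, as in the question
toy$y <- 40 + 8 * toy$young_pos + 2 * toy$x + rnorm(n)

# young_pos + young_neg equals the intercept column, so the design
# matrix has 3 columns but only rank 2.
X <- model.matrix(~ young_pos + young_neg, data = toy)
ncol(X)
qr(X)$rank

# lm() drops the redundant columns and reports NA for them.
fit_both <- lm(y ~ x * (young_pos + young_neg), data = toy)
coef(fit_both)

# With a single dummy, no coefficient is NA and the fitted values are the same.
fit_one <- lm(y ~ x * young_pos, data = toy)
coef(fit_one)
```

Applied to the model in the question, the same idea would mean keeping only one of the two dummies, for example `Volume ~ (Shower + dummy_phase * dummy_exp_group) * young_pos`. The coefficients on `young_pos` and its interactions are then read as differences relative to the older group.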