R: Getting the matched data set from the Matching package - not that easy

Time: 2014-12-29 20:48:04

Tags: r statistics matching



> require(Matching)
> data(lalonde)
> # Estimate the propensity model
> glm1  <- glm(treat~age + I(age^2) + educ + I(educ^2) + black +
+                   hisp + married + nodegr + re74  + I(re74^2) + re75 + I(re75^2) +
+                   u74 + u75, family=binomial, data=lalonde)
> #save data objects
> X  <- glm1$fitted
> Y  <- lalonde$re78
> Tr  <- lalonde$treat
> # one-to-two matching with replacement
> rr  <- Match(Y=NULL, Tr=Tr, X=X, M=2, ties=FALSE, caliper=0.01)
> summary(rr)

Estimate...  0 
SE.........  0 
T-stat.....  NaN 
p.val......  NA 

Original number of observations..............  445 
Original number of treated obs...............  185 
Matched number of observations...............  97 
Matched number of observations  (unweighted).  194 

Caliper (SDs)........................................   0.01 
Number of obs dropped by 'exact' or 'caliper'  88 

> #Obtain the matched data set
> matched <- rbind(lalonde[rr$index.treated,], lalonde[rr$index.control,])
> nrow(matched)
[1] 388



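1 answer:

Answer 0: (score: 1)

This is actually quite simple once you notice that the values in index.treated are repeated M times for those treated cases for which matches could be found within the caliper distance.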


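# Each matched treated index is repeated M = 2 times, so label its controls
# 1 and 2 (the two-level factor is recycled down the rows).
dfTC = data.frame(idxTreated = rr$index.treated, idxControl = rr$index.control,
                  numControl = factor(rep(1:2), labels = paste0("Control", 1:2)))
# Reshape to wide: one row per treated unit, one column per matched control.
dfTCWide = reshape2::dcast(dfTC, idxTreated ~ numControl,
                           value.var = "idxControl")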


> head(dfTCWide)
  idxTreated Control1 Control2
1          1      271      386
2          3      216      259
3          4      254      359
4          5      230      255
5          6      188      220
6          8      242      279
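
The wide table above only holds row indices. To build an actual matched data set, with each treated row appearing once next to its controls, something like the following might work. It is a minimal base R sketch, not part of the original answer. The vectors idx_treated and idx_control are toy stand-ins for rr$index.treated and rr$index.control (copied from the first three rows of the output above), and dat is a hypothetical placeholder for lalonde so the snippet runs without the Matching package.

```r
# Toy stand-ins for rr$index.treated and rr$index.control (assumption:
# values copied from the first three matched sets shown above).
idx_treated <- c(1, 1, 3, 3, 4, 4)
idx_control <- c(271, 386, 216, 259, 254, 359)

# Hypothetical placeholder for lalonde: 400 rows, the first 185 treated.
dat <- data.frame(id = 1:400, treat = rep(c(1, 0), times = c(185, 215)))

# Each treated unit defines one matched set; keep it only once.
treated_rows <- unique(idx_treated)

# Stack the treated rows and their controls, tagging every row with the
# index of the treated unit whose set it belongs to.
matched_long <- rbind(
  data.frame(set = treated_rows, role = "treated", dat[treated_rows, ]),
  data.frame(set = idx_treated,  role = "control", dat[idx_control, ])
)

# Sort so that each treated row is followed directly by its controls.
matched_long <- matched_long[order(matched_long$set,
                                   matched_long$role != "treated"), ]
print(matched_long)
```

With the real objects, dat would be replaced by lalonde and the two toy vectors by rr$index.treated and rr$index.control. Because matching is done with replacement, the same control row can show up in more than one set, so the set column (rather than the row name) is what identifies a match.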