Factor is missing a level after calling model.matrix() with a formula that excludes the intercept

Time: 2019-04-22 20:35:53

Tags: r

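I am trying to use model.matrix() to expand factors into a set of dummy variables. In particular, I do not want an intercept term. However, when the formula contains more than one factor, one of the factors loses a level. For example,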

df <- data.frame(cat1 = c("A", "A", "B"), n1 = c(3, 2, 1), cat2 = c("Y", "Y", "N"))
> model.matrix(~cat1 + n1, df)
  (Intercept) cat1B n1
1           1     0  3
2           1     0  2
3           1     1  1
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$cat1
[1] "contr.treatment"

> model.matrix(~cat1 + n1 + 0, df)
  cat1A cat1B n1
1     1     0  3
2     1     0  2
3     0     1  1
attr(,"assign")
[1] 1 1 2
attr(,"contrasts")
attr(,"contrasts")$cat1
[1] "contr.treatment"

But the following result confuses me.

> model.matrix(~cat1 + n1 + cat2 + 0, df)
  cat1A cat1B n1 cat2Y
1     1     0  3     1
2     1     0  2     1
3     0     1  1     0
attr(,"assign")
[1] 1 1 2 3
attr(,"contrasts")
attr(,"contrasts")$cat1
[1] "contr.treatment"

attr(,"contrasts")$cat2
[1] "contr.treatment"

Why is there no cat2N column?
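
For context, a minimal sketch of one possible way to get both cat2 columns. As far as I understand it, dropping the intercept only makes the first factor in the formula (cat1 here) use full indicator coding. Later factors still use treatment contrasts, because their full set of dummies would be linearly dependent on the columns already present. The sketch assumes that passing an identity contrast matrix through contrasts.arg is acceptable for the use case. The name mm is only illustrative, and the factors are created explicitly so that contrasts() can be called on them.

```r
# Same example data as above, with the categorical columns as explicit factors
df <- data.frame(
  cat1 = factor(c("A", "A", "B")),
  n1   = c(3, 2, 1),
  cat2 = factor(c("Y", "Y", "N"))
)

# contrasts(f, contrasts = FALSE) returns an identity matrix with one
# column per level, i.e. full dummy coding instead of treatment contrasts
mm <- model.matrix(
  ~ cat1 + n1 + cat2 + 0,
  df,
  contrasts.arg = list(cat2 = contrasts(df$cat2, contrasts = FALSE))
)

colnames(mm)
```

Note that the resulting matrix is rank deficient (cat1A + cat1B equals cat2N + cat2Y in every row), so it suits methods that tolerate redundant columns, such as regularised models, better than an unpenalised lm().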

0 answers:

No answers