Make only some features dummy variables

Time: 2019-02-09 01:01:43

Tags: r r-caret

library(dplyr)    # mutate() and %>%
library(ggplot2)  # diamonds dataset

my_diamonds <- diamonds %>% mutate(cut = as.character(cut),
                                   color = as.character(color),
                                   clarity = as.character(clarity))

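I would like to create a new data frame with just cut and color, converted to dummy variables with dummyVars.

However, I cannot get the first chunk in the code below to work: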

# make cut and color dummy vars
dummy <- caret::dummyVars("cut + color",
                            data = my_diamonds, fullRank = F, sep = ".")

# now create the dummy vars as a new data frame of training data
training_data <- predict(dummy, my_diamonds) %>% as.data.frame()

This piece:

# make cut and color dummy vars
dummy <- caret::dummyVars("cut + color",
                            data = my_diamonds, fullRank = F, sep = ".")

Gives: Error in eval(parse(text = x, keep.source = FALSE)[[1L]]) : object 'color' not found

I also tried:

dummy <- caret::dummyVars(~ "cut + color",
                            data = my_diamonds, fullRank = F, sep = ".")

Which gives: Error in terms.formula(formula, data = data) : invalid model formula in ExtractVars

How can I create a new data frame based on my_diamonds in which cut and color are dummy variables?

1 answer:

Answer 0 (score: 1)

A small issue: ~ "cut + color" should be "~ cut + color" or simply ~ cut + color (either the tilde goes inside the quotes, or the quotes are dropped):

dummy <- caret::dummyVars(~ cut + color,
                          data = my_diamonds, fullRank = FALSE, sep = ".")
training_data <- predict(dummy, my_diamonds) %>% as.data.frame()
head(training_data)
#   cutFair cutGood cutIdeal cutPremium cutVery Good colorD colorE colorF colorG colorH colorI colorJ
# 1       0       0        1          0            0      0      1      0      0      0      0      0
# 2       0       0        0          1            0      0      1      0      0      0      0      0
# 3       0       1        0          0            0      0      1      0      0      0      0      0
# 4       0       0        0          1            0      0      0      0      0      0      1      0
# 5       0       1        0          0            0      0      0      0      0      0      0      1
# 6       0       0        0          0            1      0      0      0      0      0      0      1
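
For anyone wondering why the two attempts in the question fail, here is a minimal sketch in base R only (no caret needed) of how each input is interpreted as a formula. It assumes dummyVars converts its first argument with as.formula(), which is consistent with the eval(parse(text = x, ...)) call in the first error message but is not verified against the caret source here. On the R version used in the question, a string without a tilde appears to be evaluated as the expression cut + color: cut resolves to the base function of that name, and color is the first name that cannot be found. Newer R versions reject the string with a different message, so the sketch only checks that an error occurs.

```r
# Sketch using only base R (stats); no caret required.

# 1. A string that contains the tilde converts to a one-sided formula.
f_string <- as.formula("~ cut + color")
all.vars(f_string)   # variables the formula refers to

# 2. A bare formula behaves the same way.
f_bare <- ~ cut + color
all.vars(f_bare)

# 3. Without a tilde, the string cannot be converted and as.formula() errors.
#    The exact message depends on the R version, so only the message is captured.
no_tilde <- tryCatch(as.formula("cut + color"),
                     error = function(e) conditionMessage(e))

# 4. Quoting the right-hand side turns it into a string constant, so the
#    formula refers to no variables at all.
f_quoted <- ~ "cut + color"
all.vars(f_quoted)
```

Cases 1 and 2 both name cut and color as variables, which is what the model-formula machinery needs. Case 4 names none, which fits the "invalid model formula in ExtractVars" error from the second attempt.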