Feature selection error with a logical regressor using `rfe` from `caret`

Asked: 2017-03-17 04:06:05

Tags: r regression r-caret feature-selection

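I am using `rfe` from the `caret` package to do feature selection for a linear regression. One of my regressors is a logical variable, and whenever I run feature selection with this variable included I get `Error in { : task 1 failed - "undefined columns selected"`.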

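How can I do feature selection with `rfe` when one of the predictors is a logical variable? Is it necessary to convert it to a 0/1 dummy variable?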

Here is a reproducible example:

library(caret)
x <- mtcars[-1]
y <- mtcars$mpg

set.seed(2017)
ctrl <- rfeControl(functions = lmFuncs,
                   method = "repeatedcv",
                   repeats = 5,
                   verbose = FALSE)

lmProfile1 <- rfe(x, y, sizes = 1:5, rfeControl = ctrl)

# > lmProfile1
#
# Recursive feature selection
#
# Outer resampling method: Cross-Validated (10 fold, repeated 5 times)
#
# Resampling performance over subset size:
#
#  Variables  RMSE Rsquared RMSESD RsquaredSD Selected
#          1 3.503   0.8338  1.627     0.2393
#          2 3.197   0.8841  1.347     0.1783
#          3 3.214   0.8788  1.327     0.1815
#          4 3.050   0.8861  1.341     0.1603        *
#          5 3.063   0.8842  1.254     0.1670
#         10 3.332   0.8638  1.404     0.1926
#
# The top 4 variables (out of 4):
#    wt, am, qsec, hp

# am is one of the top features; now convert it to a logical variable
x <- mtcars[-1]
x$am <- x$am == 1
y <- mtcars$mpg

set.seed(2017)
ctrl <- rfeControl(functions = lmFuncs,
                   method = "repeatedcv",
                   repeats = 5,
                   verbose = FALSE)

lmProfile2 <- rfe(x, y, sizes = 1:5, rfeControl = ctrl)

# Error in { : task 1 failed - "undefined columns selected"

# > packageVersion('caret')
# [1] ‘6.0.73’
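
On the 0/1 question, here is a minimal sketch in base R. It rests on an assumption that the post does not confirm: that the error arises because `lm` names the coefficient of a logical predictor `amTRUE`, which no longer matches the column name `am` when the ranked variables are used to subset `x`. The names `x_num`, `is_lgl`, `coef_names_lgl` and `coef_names_num` are made up for illustration.

```r
# Logical version of am, as in the failing example above
x <- mtcars[-1]
x$am <- x$am == 1
y <- mtcars$mpg

# lm() labels the coefficient of a logical predictor "amTRUE", not "am"
coef_names_lgl <- names(coef(lm(y ~ ., data = x)))

# Workaround sketch: coerce every logical column to integer 0/1,
# leaving all other columns untouched
is_lgl <- vapply(x, is.logical, logical(1))
x_num <- x
x_num[is_lgl] <- lapply(x_num[is_lgl], as.integer)

# With 0/1 integers the coefficient names match the column names again
coef_names_num <- names(coef(lm(y ~ ., data = x_num)))
```

`x_num` could then be passed to `rfe(x_num, y, sizes = 1:5, rfeControl = ctrl)` in place of `x`. Whether that removes the error under caret 6.0.73 has not been verified here.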

0 answers:

No answers yet.