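This should be simple, but the rfe function below throws this error: "Error in rfe.default(predictors, as.vector(outcomes), size = c(5), rfeControl = rfeControl(functions = "lmFuncs")) : there should be the same number of samples in x and y"
The first column of df is a factor with 2 levels. The remaining columns of df are numeric. There are no NA values.
I haven't the slightest idea what this error is about. I have tried several other suggested solutions, to no avail.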
rfe_linear <- caret::rfe(
  df[ , -1 ],
  df[ , 1 ],
  sizes = c( 5 ),
  rfeControl = rfeControl(
    functions = 'lmFuncs',
    method = 'boot',
    number = 20
  )
)
I tried the following, but it gives the same error.
rfe_linear <- caret::rfe(
  as.matrix( df[ , -1 ] ),
  df[ , 1 ],
  sizes = c( 5 ),
  rfeControl = rfeControl(
    functions = 'lmFuncs',
    method = 'boot',
    number = 20
  )
)
I tried the following. It throws this error: "Error: Must use a vector in `[`, not an object of class matrix."
rfe_linear <- caret::rfe(
  as.matrix( df[ , -1 ] ),
  as.factor( df[ , 1 ] ),
  sizes = c( 5 ),
  rfeControl = rfeControl(
    functions = 'lmFuncs',
    method = 'boot',
    number = 20
  )
)
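I also tried the following, which throws this error: "Error: $ operator is invalid for atomic vectors". Call me crazy, but I don't see a $ operator anywhere in my code.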
rfe_linear <- caret::rfe(
  df[ , -1 ],
  df[[ 1 ]],
  sizes = c( 5 ),
  rfeControl = rfeControl(
    functions = 'lmFuncs',
    method = 'boot',
    number = 20
  )
)
I have tried every combination I could think of: as.factor(), as.data.frame(), as.matrix(), df[ , 1 ], df[ , -1 ], df[ , 2:ncol( df ) ], and df[ , 1:1 ].
So I tried this:
rfe_linear <- caret::rfe(
  df[ , -1 ],
  df$Phenotype,
  sizes = c( 5 ),
  rfeControl = rfeControl(
    functions = 'lmFuncs',
    method = 'boot',
    number = 20
  )
)
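It appeared to run for a while but, of course, R won't let me run rfe() that easily: at the end of the log it threw yet another error, and the rfe_linear object is still not found.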
+(rfe) fit Resample01 size: 5191
-(rfe) fit Resample01 size: 5191
+(rfe) imp Resample01
-(rfe) imp Resample01
+(rfe) fit Resample02 size: 5191
-(rfe) fit Resample02 size: 5191
+(rfe) imp Resample02
-(rfe) imp Resample02
+(rfe) fit Resample03 size: 5191
-(rfe) fit Resample03 size: 5191
+(rfe) imp Resample03
-(rfe) imp Resample03
+(rfe) fit Resample04 size: 5191
-(rfe) fit Resample04 size: 5191
+(rfe) imp Resample04
-(rfe) imp Resample04
+(rfe) fit Resample05 size: 5191
-(rfe) fit Resample05 size: 5191
+(rfe) imp Resample05
-(rfe) imp Resample05
+(rfe) fit Resample06 size: 5191
-(rfe) fit Resample06 size: 5191
+(rfe) imp Resample06
-(rfe) imp Resample06
+(rfe) fit Resample07 size: 5191
-(rfe) fit Resample07 size: 5191
+(rfe) imp Resample07
-(rfe) imp Resample07
+(rfe) fit Resample08 size: 5191
-(rfe) fit Resample08 size: 5191
+(rfe) imp Resample08
-(rfe) imp Resample08
+(rfe) fit Resample09 size: 5191
-(rfe) fit Resample09 size: 5191
+(rfe) imp Resample09
-(rfe) imp Resample09
+(rfe) fit Resample10 size: 5191
-(rfe) fit Resample10 size: 5191
+(rfe) imp Resample10
-(rfe) imp Resample10
+(rfe) fit Resample11 size: 5191
-(rfe) fit Resample11 size: 5191
+(rfe) imp Resample11
-(rfe) imp Resample11
+(rfe) fit Resample12 size: 5191
-(rfe) fit Resample12 size: 5191
+(rfe) imp Resample12
-(rfe) imp Resample12
+(rfe) fit Resample13 size: 5191
-(rfe) fit Resample13 size: 5191
+(rfe) imp Resample13
-(rfe) imp Resample13
+(rfe) fit Resample14 size: 5191
-(rfe) fit Resample14 size: 5191
+(rfe) imp Resample14
-(rfe) imp Resample14
+(rfe) fit Resample15 size: 5191
-(rfe) fit Resample15 size: 5191
+(rfe) imp Resample15
-(rfe) imp Resample15
+(rfe) fit Resample16 size: 5191
-(rfe) fit Resample16 size: 5191
+(rfe) imp Resample16
-(rfe) imp Resample16
+(rfe) fit Resample17 size: 5191
-(rfe) fit Resample17 size: 5191
+(rfe) imp Resample17
-(rfe) imp Resample17
+(rfe) fit Resample18 size: 5191
-(rfe) fit Resample18 size: 5191
+(rfe) imp Resample18
-(rfe) imp Resample18
+(rfe) fit Resample19 size: 5191
-(rfe) fit Resample19 size: 5191
+(rfe) imp Resample19
-(rfe) imp Resample19
+(rfe) fit Resample20 size: 5191
-(rfe) fit Resample20 size: 5191
+(rfe) imp Resample20
-(rfe) imp Resample20
Error in { : task 1 failed - "replacement has 1 row, data has 0"
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Answer 0 (score: 0)
lmFuncs does not work when y is a factor. rfFuncs, for example, does work.
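For contrast, here is a minimal sketch of lmFuncs running once the outcome is numeric. The data are made up for illustration (the names `predictors`, `outcome`, and `rfe_numeric` are not from the question), and the resample count is lowered only to keep the run short.

```r
library(caret)

set.seed(42)  # arbitrary seed, only so the sketch is repeatable

# Made-up data: 100 samples, 10 numeric predictors named V1..V10
n <- 100
predictors <- as.data.frame(matrix(rnorm(n * 10), ncol = 10))

# A numeric outcome that depends on two of the predictors plus noise
outcome <- 2 * predictors$V1 - predictors$V2 + rnorm(n)

rfe_numeric <- caret::rfe(
  x = predictors,
  y = outcome,                 # numeric, so linear regression is applicable
  sizes = 5,
  rfeControl = rfeControl(
    functions = lmFuncs,       # the object itself, not the string 'lmFuncs'
    method = 'boot',
    number = 5                 # fewer resamples than the question, for speed
  )
)

rfe_numeric
```

For a factor outcome like the one in the question, a function set that supports classification is needed instead, as in the rfFuncs example that follows.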
library(caret)
library(randomForest)
df <- data.frame(y = factor(rnorm(100) > 0),
                 x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100), x4 = rnorm(100), x5 = rnorm(100),
                 x6 = rnorm(100), x7 = rnorm(100), x8 = rnorm(100), x9 = rnorm(100), x10 = rnorm(100))
rfe_linear <- caret::rfe(
  x = df[ , -1],
  y = df[ , 1],
  sizes = 5,
  rfeControl = rfeControl(
    functions = rfFuncs,
    method = 'boot',
    number = 20
  )
)
rfe_linear
#>
#> Recursive feature selection
#>
#> Outer resampling method: Bootstrapped (20 reps)
#>
#> Resampling performance over subset size:
#>
#> Variables Accuracy Kappa AccuracySD KappaSD Selected
#> 5 0.5398 0.09997 0.08876 0.1514
#> 10 0.5478 0.10734 0.07928 0.1433 *
#>
#> The top 5 variables (out of 10):
#> x8, x2, x4, x10, x9
Created on 2019-04-26 by the reprex package (v0.2.1)
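Two of the earlier error messages in the question can be reproduced in base R, which may help explain them. The sketch below rests on assumptions the thread does not confirm: that df is a tibble (the message about `[` hints at this), and that the `$` error comes from passing the string 'lmFuncs' where the answer passes the object rfFuncs. The small data frame and all variable names are hypothetical.

```r
# Hypothetical stand-in for the asker's data (column names are illustrative)
df <- data.frame(
  Phenotype = factor(rep(c("case", "control"), each = 3)),
  g1 = rnorm(6),
  g2 = rnorm(6)
)

# 1. A one-column data frame is not a vector: its length is its column count.
#    drop = FALSE mimics how a tibble treats df[ , 1 ].
y_as_frame  <- df[, 1, drop = FALSE]
y_as_vector <- df[[1]]

length(y_as_frame)    # 1, which does not match the 6 rows of predictors
length(y_as_vector)   # 6
nrow(df[, -1])        # 6

# 2. `$` cannot be applied to a character string, which is what
#    functions = 'lmFuncs' supplies in place of the lmFuncs list.
functions_as_string <- "lmFuncs"
msg <- tryCatch(
  functions_as_string$fit,
  error = function(e) conditionMessage(e)
)
msg
```

If these assumptions hold, extracting the outcome with `df[[ 1 ]]` or `df$Phenotype` and passing the function set unquoted would avoid both messages, leaving only the factor-outcome problem described in the answer.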