"wrong model type for classification" in regression problems in R-Caret

Time: 2017-03-25 16:24:59

Tags: r r-caret

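I am trying to use various prediction algorithms from the caret package in R on a regression problem, i.e. my target variable is continuous. caret seems to treat the problem as classification: whenever I pass any regression model, I get an error that says "wrong model type for classification". For reproducibility, let's use the Combined Cycle Power Plant Data Set. The data is in CCPP.zip. Let's predict power as a function of the other variables. Power is a continuous variable.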

  library(readxl)
  library(caret)
  power_plant = read_excel("Folds5x2_pp.xlsx")
  apply(power_plant,2, class)   # shows all columns are numeric

  control <- trainControl(method="repeatedcv", number=10, repeats=5)

  my_glm <- train(power_plant[,1:4], power_plant[,5],
           method = "lm",
           preProc = c("center", "scale"),
            trControl = control)

[Screenshot of the error message; image not preserved]

2 answers:

Answer 0: (score: 2)

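For some reason, caret gets confused by tibbles, the tidyverse variant of data frames that read_excel returns. If you convert the tibble to a plain data frame before passing it to caret, everything works: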

library(readxl)
library(caret)
power_plant = read_excel("Folds5x2_pp.xlsx")
apply(power_plant,2, class)   # shows all columns are numeric

power_plant <- data.frame(power_plant)
control <- trainControl(method="repeatedcv", number=10, repeats=5)

my_glm <- train(power_plant[,1:4], power_plant[,5],
                method = "lm",
                preProc = c("center", "scale"),
                trControl = control)

my_glm

This gives the following output:

Linear Regression 

9568 samples
   4 predictor

Pre-processing: centered (4), scaled (4) 
Resampling: Cross-Validated (10 fold, repeated 5 times) 
Summary of sample sizes: 8612, 8612, 8611, 8612, 8612, 8610, ... 
Resampling results:

  RMSE      Rsquared 
  4.556703  0.9287933

Tuning parameter 'intercept' was held constant at a value of TRUE
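
A likely explanation for why the conversion helps, offered as a sketch rather than a confirmed diagnosis: single-column indexing with `[` on a tibble returns a one-column tibble, not a numeric vector, so the outcome passed as `power_plant[,5]` is not numeric and caret may then treat the task as classification. The example below uses a made-up tibble (`toy_tbl`, with invented values and column names that are assumptions for illustration) to show the difference in indexing behavior.

```r
library(tibble)

# Hypothetical stand-in for the power plant data:
# four predictors and one continuous target (names and values are invented).
set.seed(1)
toy_tbl <- tibble(AT = rnorm(20), V = rnorm(20), AP = rnorm(20),
                  RH = rnorm(20), PE = rnorm(20))
toy_df <- as.data.frame(toy_tbl)

# Single-column `[` on a tibble keeps a one-column tibble ...
y_tbl <- toy_tbl[, 5]
# ... while on a plain data frame it drops to a numeric vector.
y_df <- toy_df[, 5]

is.numeric(y_tbl)  # a data frame is a list, so this is not numeric
is.numeric(y_df)   # a plain numeric vector

# `[[` extracts the column as a vector from a tibble as well.
y_vec <- toy_tbl[[5]]
is.numeric(y_vec)
```

If this is indeed the cause, passing `power_plant[[5]]` as the outcome would be an alternative to converting the whole object with `data.frame()`.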

Answer 1: (score: 0)

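I ran into a similar error when I passed the formula as a named argument, formula = y ~ x. Dropping the argument name and passing y ~ x positionally fixed it.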
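
A minimal sketch of the positional formula call described above, using invented toy data (`toy`, `x1`, `x2`, and `y` are hypothetical names, not from the question). One possible reason the named form fails is that the first argument of caret's formula method is called `form` rather than `formula`; treat that as an assumption rather than a verified cause.

```r
library(caret)

# Hypothetical toy regression data with a continuous outcome.
set.seed(42)
toy <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
toy$y <- 2 * toy$x1 - toy$x2 + rnorm(100, sd = 0.1)

# Plain 5-fold cross-validation keeps the example quick.
control <- trainControl(method = "cv", number = 5)

# The formula is passed positionally, without naming the argument.
fit <- train(y ~ ., data = toy,
             method = "lm",
             preProc = c("center", "scale"),
             trControl = control)

# Reports whether caret treated the task as regression or classification.
fit$modelType
```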