predict.train vs. predict with a recipe object

Date: 2019-03-11 03:11:52

Tags: r r-caret predict r-recipes

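After specifying a recipe in caret::train, I am trying to predict new samples. I have a couple of questions about this that I could not find answered in the caret/recipes documentation.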

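  1. Should I use predict() or predict.train()? What is the difference?
  2. Should I bake the test data with the prepped recipe before calling predict? When preProcess is used directly in train(), the advice is not to pre-process new data yourself, because the train object does this automatically. Is it the same when using a recipe?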

Below is a reproducible example that shows my process and how the predictions differ between predict and predict.train.

library(recipes)
library(caret)
# Data ----
data("credit_data")

credit_train <- credit_data[1:3500,]
credit_test <- credit_data[-(1:3500),]

# Set up recipe ----

set.seed(0)
Rec.Obj = recipe(Status ~ ., data = credit_train) %>%
    step_knnimpute(all_predictors()) %>% 
    step_center(all_numeric())%>%
    step_scale(all_numeric())

# Control parameters ----
set.seed(0)
TC = trainControl("cv",number = 10, savePredictions = "final", classProbs = TRUE, returnResamp = "final")


set.seed(0)
Model.Output = train(Rec.Obj,
                     credit_train,
                     trControl = TC,
                     tuneLength = 1,
                     metric = "Accuracy",
                     method = "glm")

# Prepped recipe ----
# prep() takes the estimation data via `training`, not `newdata`
set.seed(0)
prep.rec <- 
    prep(Rec.Obj, training = credit_train)

# Baked data for observation ----
set.seed(0)
bake.train <- bake(prep.rec, new_data = credit_train)
bake.test <- bake(prep.rec, new_data = credit_test)

# investigation of prediction methods ----

# no application of recipe to newdata
set.seed(0)
predict.norm = predict(Model.Output, credit_test, type = "raw")
predict.train = predict.train(Model.Output, credit_test,  type = "raw")

identical(predict.norm,predict.train)
# evaluates to FALSE

# Apply recipe to new data (bake.test)
predict.norm.baked = predict(Model.Output, bake.test, type = "raw")
predict.train.baked = predict.train(Model.Output, bake.test, type = "raw")

identical(predict.norm.baked, predict.train.baked)
# evaluates to FALSE

# Comparison of both predict() funcs
identical(predict.norm, predict.norm.baked)
# evaluates to FALSE

1 Answer:

Answer 0 (score: 0)

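The recipe is embedded in the train object. The answers differ for two reasons:

  1. You are supplying data that the recipe (inside Model.Output) will process a second time. You should not give predict() baked data; just call predict() and give it the original test set.

  2. Let S3 do its thing: predict.train is for the x/y interface, and predict.train.recipe is for the recipe interface. Just use predict() and it will dispatch to the appropriate method.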
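To make the two points concrete, here is a minimal sketch of the recommended workflow. It is not the code from the question: it assumes the built-in iris data and an "lda" model (which needs the MASS package) purely so the example is small and self-contained, and the object names (rec, fit, pred_raw, pred_baked) are made up for illustration.

```r
library(recipes)
library(caret)

# Illustrative split of a built-in data set (assumption: iris stands in
# for the credit data used in the question)
set.seed(0)
idx <- sample(nrow(iris), 100)
iris_train <- iris[idx, ]
iris_test  <- iris[-idx, ]

# Recipe with centering and scaling of the (all numeric) predictors
rec <- recipe(Species ~ ., data = iris_train) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())

# train() preps the recipe internally and stores it in the fitted object
fit <- train(rec,
             data = iris_train,
             method = "lda",
             trControl = trainControl(method = "cv", number = 5))

# The class vector is what the generic predict() dispatches on
class(fit)

# Recommended: call the generic with the raw, unbaked test set
pred_raw <- predict(fit, newdata = iris_test)

# Not recommended: baking first means the recipe is applied twice,
# once here and once more inside predict()
baked_test <- bake(prep(rec, training = iris_train), new_data = iris_test)
pred_baked <- predict(fit, newdata = baked_test)

# Compare the two sets of predictions
table(raw = pred_raw, baked = pred_baked)
```

Whether the two prediction vectors actually disagree depends on the data and the model, but only pred_raw reflects the pre-processing the model was trained with.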