K-fold in R: how do I store the data after each iteration?

Asked: 2017-03-27 08:52:00

Tags: r iteration

How can I save testData and trainData after each iteration of the loop below in R?

# Create 10 equally sized folds
folds <- cut(seq(1, nrow(cre_card)), breaks = 10, labels = FALSE)

# Perform 10-fold cross-validation
for (i in 1:10) {
    # Segment the data by fold using the which() function
    testIndexes <- which(folds == i)
    testData <- cre_card[testIndexes, ]
    trainData <- cre_card[-testIndexes, ]
    # Use the test and train data partitions however you desire...

}

2 answers:

Answer 0 (score: 2)

I suggest storing all of the sets in lists.

You can use the following code.

folds <- cut(seq(1, nrow(cre_card)), breaks = 10, labels = FALSE)
test_sets <- list()
train_sets <- list()
# Perform 10-fold cross-validation
for (i in 1:10) {
  # Segment the data by fold using the which() function
  testIndexes <- which(folds == i)
  testData <- cre_card[testIndexes, ]
  trainData <- cre_card[-testIndexes, ]
  # Use the test and train data partitions however you desire...
  # Append this fold's partitions to the lists
  test_sets <- c(test_sets, list(testData))
  train_sets <- c(train_sets, list(trainData))
}

You can then access the i-th train/test pair with test_sets[[i]] and train_sets[[i]].
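For reference, here is a minimal self-contained sketch of the same list-based approach. It makes a few assumptions that are not in the question: the built-in `mtcars` data set stands in for `cre_card`, the lists are preallocated with `vector("list", k)` rather than grown with `c()`, and each partition is optionally written to disk with `saveRDS()` into `tempdir()` under made-up file names.

```r
# Stand-in data: `mtcars` replaces `cre_card`, which is not shown in the question.
dat <- mtcars
k <- 10

# Assign each row to one of k contiguous folds, as in the answer above.
folds <- cut(seq_len(nrow(dat)), breaks = k, labels = FALSE)

# Preallocate the lists instead of growing them with c().
test_sets  <- vector("list", k)
train_sets <- vector("list", k)

# Hypothetical output directory; tempdir() is used only for illustration.
out_dir <- tempdir()

for (i in seq_len(k)) {
  test_idx <- which(folds == i)
  test_sets[[i]]  <- dat[test_idx, ]
  train_sets[[i]] <- dat[-test_idx, ]

  # Optionally write each partition to disk as well (file names are made up).
  saveRDS(test_sets[[i]],  file.path(out_dir, sprintf("test_fold_%02d.rds", i)))
  saveRDS(train_sets[[i]], file.path(out_dir, sprintf("train_fold_%02d.rds", i)))
}

# Sanity check: every row lands in exactly one test set.
stopifnot(sum(vapply(test_sets, nrow, integer(1))) == nrow(dat))
```

A saved fold can be read back later with `readRDS()`. Note that these folds are contiguous blocks of rows, so if the data is sorted in some meaningful way you may want to shuffle the rows first.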

Answer 1 (score: 0)

With the modelr package, you can do the following:

library(modelr)
dat <- cars
kcv <- crossv_kfold(dat, k = 10)

kcv now looks like this:

# A tibble: 10 × 3
           train           test   .id
          <list>         <list> <chr>
1 <S3: resample> <S3: resample>    01
2 <S3: resample> <S3: resample>    02
...

To train the models, you can do:

models <- lapply(kcv$train, function(x) lm(dist ~ speed, data = x))

Using the rmse function from the modelr package, you can compute the RMSE for each fold as follows:

unlist(Map(rmse, models, kcv$test))

P.S.: This example is based on ?modelr::crossv_kfold
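Since the question asks for the train and test data themselves, it may help to note that the `resample` objects in `kcv` only hold a pointer to the data plus row indices. A minimal sketch, assuming modelr is installed, of turning them into ordinary data frames (the names `train_dfs` and `test_dfs` are made up for illustration):

```r
library(modelr)

kcv <- crossv_kfold(cars, k = 10)

# Materialize each resample as an ordinary data frame.
train_dfs <- lapply(kcv$train, as.data.frame)
test_dfs  <- lapply(kcv$test, as.data.frame)

# Row indices of the first test fold within the original data.
first_test_idx <- as.integer(kcv$test[[1]])
```

Here `train_dfs[[i]]` and `test_dfs[[i]]` play the same role as `train_sets[[i]]` and `test_sets[[i]]` in the list-based answer.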