In R, how can I save testData and trainData after each iteration of the loop below?
# Create 10 equally sized folds
folds <- cut(seq(1, nrow(cre_card)), breaks = 10, labels = FALSE)
# Perform 10-fold cross-validation
for (i in 1:10) {
  # Segment the data by fold using the which() function
  testIndexes <- which(folds == i, arr.ind = TRUE)
  testData <- cre_card[testIndexes, ]
  trainData <- cre_card[-testIndexes, ]
  # Use the test and train data partitions however you desire...
}
Answer 0 (score: 2)
I suggest using lists to store all of the sets.
You can use the following code.
folds <- cut(seq(1, nrow(cre_card)), breaks = 10, labels = FALSE)
test_sets <- list()
train_sets <- list()
# Perform 10-fold cross-validation
for (i in 1:10) {
  # Segment the data by fold using the which() function
  testIndexes <- which(folds == i, arr.ind = TRUE)
  testData <- cre_card[testIndexes, ]
  trainData <- cre_card[-testIndexes, ]
  # Use the test and train data partitions however you desire...
  # Append this fold's partitions to the lists
  test_sets <- c(test_sets, list(testData))
  train_sets <- c(train_sets, list(trainData))
}
You can then access the i-th train/test pair with train_sets[[i]] and test_sets[[i]].
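If "save" means writing each split to disk rather than only keeping it in memory, the same idea can be extended along the following lines. This is a minimal sketch, not part of the original answer: it uses the built-in cars data as a stand-in for cre_card, and the output directory and file names (out_dir, fold_01.rds, ...) are assumptions chosen for illustration.

```r
# Stand-in for cre_card (assumption: any data frame works the same way)
cre_card <- cars

k <- 10
# Assign each row to one of k folds of (nearly) equal size
folds <- cut(seq_len(nrow(cre_card)), breaks = k, labels = FALSE)

# Hypothetical output location; replace with a real directory as needed
out_dir <- file.path(tempdir(), "cv_folds")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Build all train/test pairs in one pass and keep them in a list
splits <- lapply(seq_len(k), function(i) {
  test_idx <- which(folds == i)
  pair <- list(train = cre_card[-test_idx, ], test = cre_card[test_idx, ])
  # Write the pair for fold i to its own .rds file
  saveRDS(pair, file.path(out_dir, sprintf("fold_%02d.rds", i)))
  pair
})

# Read one fold back later, e.g. fold 3
fold3 <- readRDS(file.path(out_dir, "fold_03.rds"))
```

Note that the folds here follow the row order of the data, as in the question. If the rows are sorted in some meaningful way, shuffling the assignment first (for example with folds <- sample(folds)) is usually advisable.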
Answer 1 (score: 0)
With the modelr package, you can do the following:
require(modelr)
dat <- cars
kcv <- crossv_kfold(dat, k = 10)
kcv
The result looks like this:
# A tibble: 10 × 3
train test .id
<list> <list> <chr>
1 <S3: resample> <S3: resample> 01
2 <S3: resample> <S3: resample> 02
...
To train a model on each training set, you can do:
models <- lapply(kcv$train, function(x) lm(dist ~ speed, data = x))
Using the rmse function from the modelr package, you can compute the RMSE for each fold as follows:
unlist(Map(rmse, models, kcv$test))
P.S.: This example is based on ?modelr::crossv_kfold
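The train and test columns above hold resample objects (a reference to the data plus row indices) rather than data frames. If plain data frames are wanted, as in the question, they can be materialised with as.data.frame(). A short sketch, assuming the modelr package is installed:

```r
library(modelr)

kcv <- crossv_kfold(cars, k = 10)

# Each element of kcv$train / kcv$test is a resample object;
# as.data.frame() turns it into an ordinary data frame
train_sets <- lapply(kcv$train, as.data.frame)
test_sets  <- lapply(kcv$test, as.data.frame)

# Every row of cars lands in exactly one of the two sets for a given fold
nrow(train_sets[[1]]) + nrow(test_sets[[1]]) == nrow(cars)
```

The resulting lists can then be indexed with train_sets[[i]] and test_sets[[i]], or written to disk with saveRDS() if they need to persist.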