I have the following function, which is supposed to return 9 data frames:
split_data <- function(dataset, train_perc = 0.6, cv_perc = 0.2, test_perc = 0.2)
{
  m <- nrow(dataset)
  n <- ncol(dataset)
  # Sort the data randomly
  data_perm <- dataset[sample(m),]
  # Split data into training, CV, and test sets
  train <- data_perm[1:round(train_perc*m),]
  cv <- data_perm[(round(train_perc*m)+1):round((train_perc+cv_perc)*m),]
  test <- data_perm[(round((train_perc+cv_perc)*m)+1):round((train_perc+cv_perc+test_perc)*m),]
  # Split sets into X and Y
  X_train <- train[c(1:(n-1))]
  Y_train <- train[c(n)]
  X_cv <- cv[c(1:(n-1))]
  Y_cv <- cv[c(n)]
  X_test <- test[c(1:(n-1))]
  Y_test <- test[c(n)]
}
My code runs without errors, but the data frames are not created. Is there a way to do this? Thanks.
Answer 0 (score: 2)
If you want the data frames to exist in your workspace once the function has finished, you need to do the following:
1) Create an empty variable for each data frame in your R console (it can simply be set to NULL, e.g. Y_test = NULL).
2) Inside your function, use the "<<-" operator to assign to the variables created in step 1, i.e.
X_train <<- train[c(1:(n-1))]
Y_train <<- train[c(n)]
X_cv <<- cv[c(1:(n-1))]
Y_cv <<- cv[c(n)]
X_test <<- test[c(1:(n-1))]
Y_test <<- test[c(n)]
This makes the newly created data accessible from your workspace.
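A minimal sketch of the "<<-" approach described above, assuming a toy data frame `toy` and a hypothetical helper `split_xy` (neither comes from the original question). Note that "<<-" searches the enclosing environments for an existing variable and, if none is found, assigns in the global environment, so for a function defined at the top level the results end up in the workspace.

```r
# Hypothetical toy function: splits off the last column and writes both
# pieces to the enclosing environment (the global one when defined at top level).
split_xy <- function(dataset) {
  n <- ncol(dataset)
  X_demo <<- dataset[seq_len(n - 1)]  # all columns except the last
  Y_demo <<- dataset[n]               # last column only
  invisible(NULL)
}

# Illustrative data, not from the question
toy <- data.frame(a = 1:4, b = 5:8, y = c(0, 1, 0, 1))
split_xy(toy)

# Both objects are now visible outside the function
dim(X_demo)
dim(Y_demo)
```

Writing to the workspace from inside a function is a side effect, so this is probably best kept for quick interactive work; returning a value is usually easier to reason about.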
Answer 1 (score: 1)
This stores the nine data.frames in a list:
split_data <- function(dataset, train_perc = 0.6, cv_perc = 0.2, test_perc = 0.2) {
  m <- nrow(dataset)
  n <- ncol(dataset)
  # Sort the data randomly
  data_perm <- dataset[sample(m),]
  # list to store all data.frames
  out <- list()
  # Split data into training, CV, and test sets
  out$train <- data_perm[1:round(train_perc*m),]
  out$cv <- data_perm[(round(train_perc*m)+1):round((train_perc+cv_perc)*m),]
  out$test <- data_perm[(round((train_perc+cv_perc)*m)+1):round((train_perc+cv_perc+test_perc)*m),]
  # Split sets into X and Y (read from the list, since train/cv/test
  # no longer exist as separate local variables)
  out$X_train <- out$train[c(1:(n-1))]
  out$Y_train <- out$train[c(n)]
  out$X_cv <- out$cv[c(1:(n-1))]
  out$Y_cv <- out$cv[c(n)]
  out$X_test <- out$test[c(1:(n-1))]
  out$Y_test <- out$test[c(n)]
  return(out)
}
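A short sketch of how a list returned this way might be used. The function `make_parts` and the data frame `toy` are hypothetical stand-ins chosen to keep the example small; the same access pattern should apply to the list returned by a function like the one above.

```r
# Hypothetical stand-in for a function that returns several data frames in a list
make_parts <- function(dataset) {
  n <- ncol(dataset)
  list(
    X = dataset[seq_len(n - 1)],  # feature columns
    Y = dataset[n]                # last column (target)
  )
}

# Illustrative data, not from the question
toy <- data.frame(a = 1:4, b = 5:8, y = c(0, 1, 0, 1))
parts <- make_parts(toy)

# Access individual data frames by name
names(parts)
dim(parts$X)

# Optionally copy every list element into an environment as separate variables.
# A fresh environment is used here; envir = globalenv() would target the workspace.
env <- new.env()
list2env(parts, envir = env)
ls(env)
```

Returning a list keeps the function free of side effects, and the caller decides whether the pieces stay in the list or become separate variables.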