I tried the example published on this site and followed the exact steps up to the point where the train function is called.
library(dplyr)
data_train <- read.csv("https://raw.githubusercontent.com/guru99-edu/R-Programming/master/train.csv")
glimpse(data_train)
data_test <- read.csv("https://raw.githubusercontent.com/guru99-edu/R-Programming/master/test.csv")
glimpse(data_test)
library(randomForest)
library(caret)
library(e1071)
trControl <- trainControl(method = "cv",
                          number = 10,
                          search = "grid")
set.seed(1234)
rf_default <- train(Survived ~ .,
                    data = data_train,
                    method = "rf",
                    metric = "Accuracy",
                    trControl = trControl)
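The call to train fails with this error:
na.fail.default(list(Survived = c(0L, 1L, 1L, 1L, 0L, 0L, 0L, : missing values in object
However, there are no missing values in the Survived variable.
Can anyone tell me what is wrong? I am using R version 3.5.1 and also tried it on 3.6.1. Thanks.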
Answer 0 (score: 2)
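There are a couple of issues. The first is that you have NAs in the data. The error is raised when any column used in the model contains a missing value, not only Survived. You can either impute these values or omit them. For simplicity, I omitted them.
Second, you need to use a factor as the response for classification.
set.seed(1234)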
# Drop incomplete rows, then convert the response to a factor
new_data <- na.omit(data_train) %>%
  as_tibble() %>%
  mutate(Survived = as.factor(Survived))
rf_default <- train(Survived ~ .,
                    data = new_data,
                    method = "rf",
                    metric = "Accuracy",
                    trControl = trControl)
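To see where the missing values actually are, and to compare omitting with imputing, here is a minimal sketch in base R. It uses a small made-up data frame (toy, with hypothetical values) instead of the real data_train so that it runs without a download. Median imputation is only one simple option that I picked for illustration; it is not something the original example prescribes.

```r
# Toy stand-in for data_train; the values are made up for illustration
toy <- data.frame(
  Survived = c(0L, 1L, 1L, 0L, 1L, 0L),
  Age      = c(22, NA, 26, 35, NA, 54),
  Fare     = c(7.25, 71.28, 7.92, 53.10, 8.05, 51.86)
)

# Count missing values per column: Survived can be complete
# while a predictor still contains NAs
na_counts <- colSums(is.na(toy))
print(na_counts)

# Option A: drop every row that has at least one NA
toy_omit <- na.omit(toy)

# Option B: fill the NAs in a numeric column with that column's median
toy_imputed <- toy
toy_imputed$Age[is.na(toy_imputed$Age)] <- median(toy$Age, na.rm = TRUE)

# Either way, make the response a factor so the model is fit
# as classification rather than regression
toy_omit$Survived    <- as.factor(toy_omit$Survived)
toy_imputed$Survived <- as.factor(toy_imputed$Survived)
```

Running colSums(is.na(data_train)) on the real data should show which predictors trigger the na.fail error. Omitting is the quickest fix but discards rows; imputing keeps them at the cost of introducing made-up values.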