Getting an error message from the train function with method "rf"

Date: 2019-08-14 06:03:15

Tags: r r-caret

I tried the example posted on this site and followed the exact steps up to the call to the train function.

library(dplyr)

data_train <- read.csv("https://raw.githubusercontent.com/guru99-edu/R-Programming/master/train.csv")

glimpse(data_train)

data_test <- read.csv("https://raw.githubusercontent.com/guru99-edu/R-Programming/master/test.csv")

glimpse(data_test)

library(randomForest)

library(caret)

library(e1071)

trControl <- trainControl(method = "cv",
    number = 10,
    search = "grid")

set.seed(1234)

rf_default <- train(Survived~., 
    data = data_train,
    method = "rf",
    metric = "Accuracy",
    trControl = trControl)

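I used R versions 3.5.1 and 3.6.1, and in both I get this error:

na.fail.default(list(Survived = c(0L,1L,1L,1L,0L,0L,0L, : missing values in object

However, there are no missing values in the Survived variable.

Can someone tell me what is wrong? Thanks.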

1 Answer:

Answer 0 (score: 2)

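There are a couple of issues. The first is that you have NAs in the data. Because the formula Survived ~ . uses every column, a missing value in any predictor triggers the error, even though Survived itself is complete. You can either impute those values or omit them. For simplicity, I omitted them.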
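To see where the missing values actually are, it helps to count them per column. The sketch below uses a small made-up data frame (`toy`, with hypothetical values, not the Titanic data) so it runs without a download. `colSums(is.na(...))` and `na.omit()` are base R; the median imputation is only one simple option and is my assumption, not something the tutorial prescribes.

```r
# Toy data standing in for data_train (hypothetical values)
toy <- data.frame(
  Survived = c(0L, 1L, 1L, 0L, 1L),
  Age      = c(22, NA, 26, 35, NA),
  Fare     = c(7.25, 71.28, 7.93, NA, 8.05)
)

# Count missing values in every column; the outcome can be complete
# while the predictors are not
na_counts <- colSums(is.na(toy))
print(na_counts)

# Option 1: drop every row that has an NA in any column
toy_omit <- na.omit(toy)

# Option 2: replace NAs in numeric columns with the column median
toy_imputed <- toy
num_cols <- names(toy_imputed)[sapply(toy_imputed, is.numeric)]
for (col in num_cols) {
  med <- median(toy_imputed[[col]], na.rm = TRUE)
  toy_imputed[[col]][is.na(toy_imputed[[col]])] <- med
}
```

Omitting rows is the quickest fix but throws data away; imputing keeps every row at the cost of inventing values, so which one is appropriate depends on how many rows are affected.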

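Second, you need to use a factor as the outcome for classification.

set.seed(1234)

# Drop every row that contains an NA in any column
new_data <- na.omit(data_train)
# Convert the outcome to a factor so that train() fits a classifier
new_data <- as_tibble(new_data) %>%
  mutate(Survived = as.factor(Survived))
rf_default <- train(Survived ~ .,
                    data = new_data,
                    method = "rf",
                    metric = "Accuracy",
                    trControl = trControl)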
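As far as I know, train decides between regression and classification from the type of the outcome column, so an integer 0/1 Survived is treated as a number and metric = "Accuracy" does not apply. A minimal base R sketch of the check and the conversion, again on a made-up data frame; the labels "No" and "Yes" are my own choice for illustration and are not required by the code above.

```r
# Hypothetical outcome column coded as integers, as in train.csv
toy <- data.frame(Survived = c(0L, 1L, 1L, 0L))

# Before conversion the column is numeric, not categorical
outcome_class_before <- class(toy$Survived)

# Convert to a factor with readable, syntactically valid level names
toy$Survived <- factor(toy$Survived, levels = c(0, 1), labels = c("No", "Yes"))

outcome_class_after <- class(toy$Survived)
print(table(toy$Survived))
```

Level names that are valid R names become relevant if you later set classProbs = TRUE in trainControl; plain as.factor, as used above, is enough for an accuracy-based fit.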