Naive Bayes prediction in R: reading strings as factors vs. as characters

Time: 2016-07-21 05:53:19

Tags: r naivebayes

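I am trying to use Naive Bayes on the Mushroom Data Set. The data set is 8124*23, and the first column is the response variable {'edible','poisonous'}. I removed the rows with missing data, which leaves 5644*23. Below is the code I used.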

# naiveBayes() with a laplace argument is assumed to come from the e1071 package
library(e1071)
mushroom.data <- read.csv("mushroom.data",header = FALSE, stringsAsFactors = FALSE)
#mushroom.data <- read.csv("mushroom.data",header = FALSE, stringsAsFactors = TRUE)

#Eliminating missing data
mushroom.data <- subset(mushroom.data,mushroom.data$V12 != '?')
# Factoring target class
mushroom.data$V1 <- as.factor(mushroom.data$V1)
# First 4000 records as Training set. 
mushroom.train.class <- mushroom.data[1:4000,1]
mushroom.train.data <- mushroom.data[1:4000,-1]
# Building naive bayes classifier
nb.model <- naiveBayes(mushroom.train.data,mushroom.train.class,laplace = 1)
# Last 1644 are Test records
mushroom.test.data <- mushroom.data[4001:5644,-1]
mushroom.test.class <- mushroom.data[4001:5644,1]
# Prediction
nb.pred <- predict(nb.model,mushroom.test.data)
# checking proportions of the predictions
prop.table(table(nb.pred))

With stringsAsFactors = FALSE the model predicts everything as the edible class and the accuracy is 10-15%; with stringsAsFactors = TRUE the accuracy is 91%. What is going on with the factoring?

Edit 1: Changed the title. The original problem is solved.

1 Answer:

Answer 0: (score: 1)

You can't model character columns with NaiveBayes. Check ?NaiveBayes and note the Arguments section.
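
If the file has already been read with stringsAsFactors = FALSE, one way to follow this advice is to convert the character columns to factors before splitting into training and test sets, so that both parts share the same factor levels. The following is only a minimal sketch: the `toy` data frame and its values are made up for illustration and stand in for the real mushroom data, and the fitting step assumes the e1071 package, so it is skipped when that package is not installed.

```r
# Toy stand-in for the mushroom data (hypothetical values, not the real data set)
toy <- data.frame(
  V1 = c("e", "p", "e", "p", "e", "p"),
  V2 = c("x", "x", "b", "b", "x", "b"),
  V3 = c("s", "y", "s", "y", "s", "y"),
  stringsAsFactors = FALSE
)

# Every column is character at this point
print(sapply(toy, class))

# Convert each character column to a factor, leaving other columns untouched
is_chr <- sapply(toy, is.character)
toy[is_chr] <- lapply(toy[is_chr], factor)

# Confirm that the conversion covered every column
stopifnot(all(sapply(toy, is.factor)))

# Fit and predict only if e1071 is available (assumed source of naiveBayes)
if (requireNamespace("e1071", quietly = TRUE)) {
  nb.toy <- e1071::naiveBayes(toy[, -1], toy$V1, laplace = 1)
  nb.toy.pred <- predict(nb.toy, toy[, -1])
  # Confusion table and accuracy on the toy rows
  print(table(predicted = nb.toy.pred, actual = toy$V1))
  print(mean(nb.toy.pred == toy$V1))
}
```

For the code in the question, the same `lapply` conversion would go right after the `subset()` call and before the rows are split into `mushroom.train.data` and `mushroom.test.data`.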