"Wrong model type for regression" error in 10-fold cross-validation of Naive Bayes using R

Time: 2014-04-29 07:09:24

Tags: r machine-learning

I am implementing 10-fold cross-validation of Naive Bayes on some test data with 2 classes (0 and 1). I followed the steps below and got an error.

data(testdata)

attach(testdata)

X <- subset(testdata, select=-Class)

Y <- Class

library(e1071)

naive_bayes <- naiveBayes(X,Y)

library(caret)
library(klaR)

nb_cv <- train(X, Y, method = "nb", trControl = trainControl(method = "cv", number = 10))

## Error:
## Error in train.default(X, Y, method = "nb", trControl = trainControl(number = 10)) : 
## wrong model type for regression


dput(testdata)

structure(list(Feature.1 = 6.534088, Feature.2 = -19.050915, 
Feature.3 = 7.599378, Feature.4 = 5.093594, Feature.5 = -22.15166, 
Feature.6 = -7.478444, Feature.7 = -59.534652, Feature.8 = -1.587918, 
Feature.9 = -5.76889, Feature.10 = 95.810563, Feature.11 = 49.124086, 
Feature.12 = -21.101489, Feature.13 = -9.187984, Feature.14 = -10.53006, 
Feature.15 = -3.782506, Feature.16 = -10.805074, Feature.17 = 34.039509, 
Feature.18 = 5.64245, Feature.19 = 19.389724, Feature.20 = 16.450196, 
Class = 1L), .Names = c("Feature.1", "Feature.2", "Feature.3", 
"Feature.4", "Feature.5", "Feature.6", "Feature.7", "Feature.8", 
"Feature.9", "Feature.10", "Feature.11", "Feature.12", "Feature.13", 
"Feature.14", "Feature.15", "Feature.16", "Feature.17", "Feature.18", 
"Feature.19", "Feature.20", "Class"), class = "data.frame", row.names = c(NA, 
-1L))

Also, how can I calculate R-squared or AUC for this model?

Dataset: 10,000 records with 20 features and a binary class.

3 answers:

Answer 0 (score: 10)

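Naive Bayes is a classifier, so converting Y to a factor or boolean is the right way to fix the problem. Your original formulation uses a classification tool but passes numeric values, so R gets confused: with a numeric Y, train() treats the task as regression, which is why it reports "wrong model type for regression".

As for R-squared, that metric is computed only for regression problems, not for classification problems. To evaluate a classifier there are other metrics, such as precision and recall.

For more on these metrics, see the Wikipedia article: http://en.wikipedia.org/wiki/Binary_classification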
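
As a rough illustration of the metrics mentioned above, here is a minimal base-R sketch that computes precision and recall from a confusion table, plus AUC (which the question asked about) using the rank-based Mann-Whitney formulation. The vectors `actual`, `predicted` and `scores` are made-up toy values, not the asker's data; in practice they would come from the held-out folds of the cross-validation.

```r
# Toy labels and predictions (hypothetical values, for illustration only)
actual    <- factor(c(1, 1, 1, 0, 0, 0, 1, 0), levels = c(0, 1))
predicted <- factor(c(1, 0, 1, 0, 1, 0, 1, 0), levels = c(0, 1))

# Confusion table: rows are predictions, columns are true classes
cm <- table(Predicted = predicted, Actual = actual)
tp <- cm["1", "1"]  # predicted 1, actually 1
fp <- cm["1", "0"]  # predicted 1, actually 0
fn <- cm["0", "1"]  # predicted 0, actually 1

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)

# AUC from predicted scores for class 1 (hypothetical values):
# the fraction of (positive, negative) pairs where the positive scores higher
scores <- c(0.9, 0.4, 0.8, 0.3, 0.6, 0.2, 0.7, 0.1)
pos    <- actual == "1"
n_pos  <- sum(pos)
n_neg  <- sum(!pos)
auc    <- (sum(rank(scores)[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(c(precision = precision, recall = recall, auc = auc))
```

Packages such as caret (confusionMatrix) offer the same quantities ready-made; the hand-rolled version is only meant to show what is being computed.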

Answer 1 (score: 4)

It worked after changing the label vector:

Y <- as.factor(Y)
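
A minimal sketch of that fix in context, assuming the caret and klaR packages are installed. The data here is simulated (two features instead of the asker's twenty), and the level labels "neg" and "pos" are an arbitrary choice for illustration; only the conversion of the response to a factor before calling train() reflects the answer.

```r
library(caret)  # train(), trainControl()
library(klaR)   # provides the Naive Bayes backend used by method = "nb"

set.seed(42)
n <- 200

# Simulated stand-in for the asker's data (hypothetical values)
X <- data.frame(Feature.1 = rnorm(n), Feature.2 = rnorm(n))
Class <- as.integer(X$Feature.1 + rnorm(n) > 0)  # integer 0/1, as in the question

# The fix: make the response a factor so train() runs classification
Y <- factor(Class, levels = c(0, 1), labels = c("neg", "pos"))

nb_cv <- train(X, Y, method = "nb",
               trControl = trainControl(method = "cv", number = 10))
print(nb_cv)  # reports cross-validated accuracy and kappa
```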

Answer 2 (score: 0)

Add this to your structure:

colClasses=c("Class"="character")
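
The answer does not say where this argument goes. One plausible reading, offered here as an assumption, is that it is passed when the data is read in, for example with read.csv. The sketch below uses a small inline CSV (made-up values) so that it runs on its own; with a character Class column, an explicit factor() call may still be needed depending on the modelling function.

```r
# Hypothetical inline CSV standing in for the asker's data file
csv_text <- "Feature.1,Feature.2,Class
6.53,-19.05,1
-1.58,5.09,0
7.59,-22.15,1
"

# Read the Class column as character instead of integer
testdata <- read.csv(text = csv_text, colClasses = c(Class = "character"))
print(class(testdata$Class))

# Convert to a factor so the labels are unambiguously categorical
Y <- factor(testdata$Class)
print(levels(Y))
```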