invalid 'type' (character) of argument

Time: 2016-04-15 07:13:00

Tags: r classification text-classification naivebayes document-classification

The title is the error message I get when I try to run a naive Bayes classifier. Here is a summary of my training data:

'data.frame':   7269 obs. of  193 variables:
 $ pid       : int  2 4 5 7 10 11 14 18 25 31 ...
 $ acquir    : int  0 0 0 0 1 1 0 0 0 0 ...
 $ addit     : int  0 0 0 0 2 2 0 0 0 0 ...
 $ agre      : int  0 0 0 0 0 0 0 0 0 0 ...
 $ agreement : int  0 0 0 0 0 0 0 0 0 0 ...
 $ also      : int  1 0 0 0 2 2 0 0 0 0 ...
 $ american  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ announc   : int  0 0 0 0 0 0 0 0 0 0 ...
 $ annual    : int  0 0 0 0 0 0 0 0 2 0 ...
 $ approv    : int  0 3 0 0 0 0 0 0 0 0 ...
 $ april     : int  0 0 0 0 0 0 0 0 1 0 ...
 $ bank      : int  0 7 0 0 0 0 0 0 0 0 ...
 $ base      : int  0 0 0 0 0 0 0 0 0 0 ...
 .
 .
 $... all of them are integer, except the class column
 .
 .
 $ class     : Factor w/ 10 levels "acq","corn","crude",..: 1 1 4 4 9 1 4 3 1 4 ...

Here is the naiveBayes() call:

model <- naiveBayes(as.factor(class) ~ ., data = as.matrix(train), laplace = 3)

Can anyone tell me why this happens?

Error in sum(x) : invalid 'type' (character) of argument

1 answer:

Answer 0: (score: 1)

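Because of as.matrix(train), your data ends up converted to character: a matrix can hold only one type, and the factor column class forces every cell to character. Try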

model <- naiveBayes(class ~ ., data=train, laplace = 3)

or, alternatively,

model <- naiveBayes(train$class ~ ., data = train[, setdiff(names(train), "class")], laplace = 3)

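The second variant is more or less the same as the first. The . on the RHS of the formula expands to "all other variables", so it excludes the column class mentioned on the LHS. (See the documentation of formula for more information.)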
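To make the two points above concrete, here is a minimal sketch in base R. The data frame toy is a hypothetical stand-in for train (two integer count columns plus a factor label), not the asker's actual data, and the names m and rhs are introduced only for illustration. It shows that as.matrix() turns a mixed data frame into a character matrix, and how . on the RHS of a formula expands.

```r
# Hypothetical stand-in for `train`: integer counts plus a factor label.
toy <- data.frame(
  bank  = c(0L, 7L, 0L, 1L),
  april = c(1L, 0L, 0L, 2L),
  class = factor(c("acq", "corn", "acq", "crude"))
)

# A matrix holds a single type, so the factor column forces
# every cell, including the counts, to character.
m <- as.matrix(toy)
typeof(m)

# Summing a column of that matrix reproduces the error from the question:
# sum(m[, "bank"])   # invalid 'type' (character) of argument

# The data frame keeps each column's own type, so this works.
sapply(toy, class)
sum(toy$bank)

# `.` expands to every column of `data` that is not on the LHS.
rhs <- attr(terms(class ~ ., data = toy), "term.labels")
rhs
```

Passing the data frame itself as data, as in the first suggestion, avoids the coercion entirely.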