Can anyone advise me and help me create a confusion matrix for an SVM model? I get the following error:
"Error: 'data' and 'reference' should be factors with the same levels."
It comes from the confusion matrix code below:
confusionMatrix(predA, tmp_test$Score)
I also tried
confusionMatrix(table(predA, tmp_test))
and then got the following error:
"Error in table(predA, tmp_test) : all arguments must have the same length"
The SVM model is a regression model.
Sample table:
Unhelpful Score
7 1
8 3
5 1
7 2
4 1
4 1
5 1
9 2
6 1
5 1
11 3
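There are 2108 obs. of 2 variables. There are no missing or invalid data and no 0 (zero) values. Unhelpful ranges from 4 to 2016. Score ranges from 1 to 3.
Here is my code: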
# Random sampling
samplesize = 0.60 * nrow(dsTemp)
set.seed(80)
index = sample(seq_len(nrow(dsTemp)), size = samplesize)
# Create training and test set
datatrain = dsTemp[ index, ]
datatest = dsTemp[ -index, ]
library(caret)
library(e1071)
library(tidyverse)
tmp_train <- datatrain
tmp_test <- datatest
# originally the data types were int, but I had to change them to factor
# for the model to work
dsTemp$Score <- factor(dsTemp$Score)
dsTemp$Unhelpful <- factor(dsTemp$Unhelpful)
dsTemp$Unhelpful <- factor(dsTemp$Unhelpful)
dsTemp$Score <- factor(dsTemp$Score)
#svm model
svmModel <- svm(Score ~ ., data = tmp_train, kernel = 'linear', gamma = 0.2, cost = 100)
#predictions
predA <- predict(svmModel, tmp_test)
Edit
tmp_train$Score <- factor(tmp_train$Score)
tmp_test$Score <- factor(tmp_test$Score)
tmp_train$HelpfulnessDenominator <- factor(tmp_train$HelpfulnessDenominator)
tmp_test$HelpfulnessDenominator <- factor(tmp_test$HelpfulnessDenominator)
I still get an error after
confusionMatrix(predA, tmp_test)
or
confusionMatrix(table(predA, tmp_test))
str(predA)
Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
- attr(*, "names")= chr [1:1264] "927" "1179" "1655" "156" …
str(tmp_test$Score)
Factor w/ 3 levels "1","2","3": 1 3 3 3 1 1 1 2 2 3 ...
Answer 0 (score: 0)
It looks like you converted the columns to factors in dsTemp rather than in the training and test sets:
dsTemp$Score <- factor(dsTemp$Score)
dsTemp$Unhelpful <- factor(dsTemp$Unhelpful)
dsTemp$Unhelpful <- factor(dsTemp$Unhelpful)
dsTemp$Score <- factor(dsTemp$Score) #also this is just a repetition
Instead, it should be:
tmp_train$Score <- factor(tmp_train$Score)
tmp_test$Score <- factor(tmp_test$Score)
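This matters because tmp_train and tmp_test are the data frames passed to svm() and predict() below. They were copied from dsTemp before the conversion, so Score is still an integer there and svm() fits a regression instead of a classifier: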
#svm model
svmModel <- svm(Score ~ ., data = tmp_train, kernel = 'linear', gamma = 0.2, cost = 100)
#predictions
predA <- predict(svmModel, tmp_test)
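To see why converting dsTemp after the split has no effect, here is a minimal base R sketch. The toy values, the fixed index and the helper names class_before and score_levels are assumptions for illustration only; they are not taken from your real data.

```r
# Toy stand-in for dsTemp; the values are illustrative, not the real data.
dsTemp <- data.frame(
  Unhelpful = c(7, 8, 5, 7, 4, 4),
  Score     = c(1L, 3L, 1L, 2L, 1L, 1L)
)

# Split first, as in the question (a fixed index keeps the example deterministic).
index     <- 1:4
tmp_train <- dsTemp[index, ]
tmp_test  <- dsTemp[-index, ]

# Converting the parent afterwards does not touch the copies.
dsTemp$Score <- factor(dsTemp$Score)
class_before <- class(tmp_train$Score)
class(dsTemp$Score)  # "factor"
class_before         # "integer"

# Convert the copies themselves, using one shared set of levels so that
# train, test and predictions all agree even if a class is absent from a split.
score_levels    <- c(1, 2, 3)
tmp_train$Score <- factor(tmp_train$Score, levels = score_levels)
tmp_test$Score  <- factor(tmp_test$Score,  levels = score_levels)
identical(levels(tmp_train$Score), levels(tmp_test$Score))  # TRUE
```

Alternatively, converting dsTemp$Score before creating datatrain and datatest should have the same effect, since the copies then inherit the factor column.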
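This is the correct call to confusionMatrix, with the Score column rather than the whole tmp_test data frame as the reference: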
confusionMatrix(predA, tmp_test$Score)
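Both errors from the question can be checked before calling confusionMatrix. The sketch below uses made-up stand-ins for tmp_test and predA (the values are hypothetical) and base R's table() so it runs without caret. It also shows why table(predA, tmp_test) complains about lengths: the length of a data frame is its number of columns.

```r
# Hypothetical stand-ins for tmp_test and predA; not the real data.
tmp_test <- data.frame(
  Unhelpful = c(7, 8, 5, 9, 11),
  Score     = factor(c(1, 3, 1, 2, 3), levels = c(1, 2, 3))
)
predA <- factor(c(1, 1, 1, 2, 3), levels = levels(tmp_test$Score))

# A data frame's length is its column count, not its row count.
length(tmp_test)        # 2
length(tmp_test$Score)  # 5
length(predA)           # 5

# Guard against the two conditions the reported errors point at.
stopifnot(
  is.factor(predA), is.factor(tmp_test$Score),
  identical(levels(predA), levels(tmp_test$Score)),
  length(predA) == length(tmp_test$Score)
)

# Base R cross-tabulation: rows are predictions, columns are the reference.
cm <- table(Prediction = predA, Reference = tmp_test$Score)
accuracy <- sum(diag(cm)) / sum(cm)
accuracy  # 0.8

# With caret loaded, the same pair of factors can be passed directly:
# confusionMatrix(data = predA, reference = tmp_test$Score)
```

If the stopifnot() guard fails on your real objects, it tells you which of the two problems (factor levels or lengths) is still present.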