I have tried all the solutions suggested on Stack Overflow for this problem, but I am still getting "Error: `data` and `reference` should be factors with the same levels."
set.seed(10)
indices = sample.split(consumers$label, SplitRatio = 0.75)
train = consumers[indices,]
test = consumers[!(indices),]
##Build a logistic regression model
is.factor(train$label)
contrasts(train$label)
lr_model <- data.frame(label = as.numeric(rnorm(100)>0.5), b= rnorm(100), c = rnorm(100), d = rnorm(100))
logitMod <- glm(label ~ ., data=train, family=binomial(link="logit"))
pdata <- predict(logitMod, newdata = train, type = "response")
confusionMatrix(data = as.numeric(pdata>0.5), reference = train$label)
My dataset has three columns: ration, time, and label (the label is male or female).
Answer 0 (score: 1)
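I have a hunch that you are using caret::confusionMatrix, so here goes. You are passing a numeric vector as data and a factor as reference. Note that the documentation asks for a factor of predicted classes, or a table, as data.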
> library(caret)
>
> ref <- factor(sample(0:1, size = 100, replace = TRUE))
> data1 <- sample(0:1, size = 100, replace = TRUE)
> data2 <- factor(sample(0:1, size = 100, replace = TRUE))
# this is your case
> confusionMatrix(data = data1, reference = ref)
Error: `data` and `reference` should be factors with the same levels.
# pass in a factor (try a table for giggles)
> confusionMatrix(data = data2, reference = ref)
Confusion Matrix and Statistics

          Reference
Prediction  0  1
         0 24 19
         1 33 24

               Accuracy : 0.48
                 95% CI : (0.379, 0.5822)
    No Information Rate : 0.57
    P-Value [Acc > NIR] : 0.97198

                  Kappa : -0.02
 Mcnemar's Test P-Value : 0.07142

            Sensitivity : 0.4211
            Specificity : 0.5581
         Pos Pred Value : 0.5581
         Neg Pred Value : 0.4211
             Prevalence : 0.5700
         Detection Rate : 0.2400
   Detection Prevalence : 0.4300
      Balanced Accuracy : 0.4896

       'Positive' Class : 0
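The "try a table" hint can be sketched as follows. This is only an illustration on simulated data, not the asker's dataset: the names pred, ref and tab and the seed are made up for the example, and the caret call is guarded so the snippet still runs where caret is not installed.

```r
set.seed(1)  # arbitrary seed, only to make the simulated data repeatable

# Simulated reference and predictions (hypothetical data), both stored as
# factors that share the levels "0" and "1".
ref  <- factor(sample(0:1, size = 100, replace = TRUE), levels = c(0, 1))
pred <- factor(sample(0:1, size = 100, replace = TRUE), levels = c(0, 1))

# Cross-tabulate with predictions in rows and the reference in columns.
tab <- table(Prediction = pred, Reference = ref)
print(tab)

# Pass the table instead of the two vectors.
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(tab))
}
```

Building the table yourself makes it easy to check which axis holds the predictions before any statistics are computed.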
Answer 1 (score: 0)
confusionMatrix(data = as.factor(as.numeric(pdata>0.5)), reference = train$label)
This should work, provided train$label is a factor whose levels are also 0 and 1.
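If the reference labels are class names such as male and female rather than 0 and 1, the thresholded probabilities also have to be mapped onto those names before the two factors share levels. Here is a minimal sketch of that step, assuming a two-level factor label. The toy data frame is simulated and only borrows the column names from the question, and the 0.5 cutoff is the one used above. It relies on glm treating the first factor level as failure, so the predicted probability refers to the second level.

```r
set.seed(10)

# Hypothetical stand-in for the asker's data: two numeric predictors and a
# two-level factor label (levels sort alphabetically: "female", "male").
n <- 200
toy <- data.frame(ration = rnorm(n), time = rnorm(n))
toy$label <- factor(ifelse(toy$ration + rnorm(n) > 0, "male", "female"))

fit <- glm(label ~ ., data = toy, family = binomial(link = "logit"))
p <- predict(fit, newdata = toy, type = "response")

# p is the probability of the second level, so map values above the cutoff
# to lv[2] and the rest to lv[1], reusing the reference's own levels.
lv <- levels(toy$label)
pred <- factor(ifelse(p > 0.5, lv[2], lv[1]), levels = lv)

print(table(Prediction = pred, Reference = toy$label))

# With matching levels, caret accepts the two factors directly.
if (requireNamespace("caret", quietly = TRUE)) {
  print(caret::confusionMatrix(data = pred, reference = toy$label))
}
```

Setting levels = lv explicitly keeps both levels on the prediction factor even when the model happens to predict only one class.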