Turn probabilities into classes and display frequencies

Question

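I have a CSV file containing estimated probabilities and actual outcomes. I want to build a confusion matrix for the estimated probabilities using a threshold of 0.5, but I keep getting the error "Error: `data` and `reference` should be factors with the same levels." What is wrong? See the code below.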

Here is the code I tried:

# Turn probabilities into classes and display frequencies

p_class = ifelse (probs_truth$estimated > 0.5, 1, 0)
table(p_class)

# Make the confusion matrix

predicted = p_class
actual = probs_truth$truth

library(caret)
result = confusionMatrix (data=predicted, reference=actual)
print(result)

I expected a confusion matrix table to be returned.

Answer 1

The following code works for me; I hope it helps. I made a small dataset that I assume is similar to your data.

library(data.table)
probs_truth <- data.table(estimated = c(0.5, 0.3, 0.7, 0.8, 0.1), actual = c(1, 0, 0, 1, 0))

Add a column ('estimated2') to the dataset based on the ifelse statement.

probs_truth$estimated2 = ifelse (probs_truth$estimated > 0.5, 1, 0)

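Make sure 'estimated2' and 'actual' are factors. confusionMatrix() raises the error from the question when either input is not a factor, and ifelse() returns a plain numeric vector.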

probs_truth$estimated2 <- as.factor(probs_truth$estimated2)
probs_truth$actual <- as.factor(probs_truth$actual)
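
One caveat: as.factor() derives its levels from the values actually present, so if the thresholded predictions happen to contain only one class, the two factors can end up with different level sets. A minimal sketch of declaring the levels explicitly, assuming binary 0/1 coding (the names class_levels, predicted_f and actual_f are hypothetical, and the data is the same toy data as above):

```r
# Toy data mirroring the example above (values are illustrative only).
estimated <- c(0.5, 0.3, 0.7, 0.8, 0.1)
actual    <- c(1, 0, 0, 1, 0)

# ifelse() returns a plain numeric vector, not a factor.
p_class <- ifelse(estimated > 0.5, 1, 0)

# Assumption: the outcome is coded 0/1. Declaring both classes up front
# guarantees the two factors share the same levels, even when one class
# never shows up in one of the vectors.
class_levels <- c(0, 1)
predicted_f <- factor(p_class, levels = class_levels)
actual_f    <- factor(actual,  levels = class_levels)

# Both factors now carry the same levels, in the same order.
print(identical(levels(predicted_f), levels(actual_f)))
```

The same two factors can then be passed as data and reference.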

head(probs_truth)

library(caret)
result = confusionMatrix (data=probs_truth$estimated2, reference=probs_truth$actual)
print(result)
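
As a cross-check, the counts can also be tabulated with base R. The sketch below hard-codes the thresholded toy predictions from this answer and, as an assumption about what you want, treats "1" as the positive class; with two levels, confusionMatrix() otherwise uses the first level ("0" here) as the positive result. The caret call is guarded so the snippet still runs when the package is not installed.

```r
# Thresholded toy predictions and outcomes, with shared explicit levels.
predicted <- factor(c(0, 0, 1, 1, 0), levels = c(0, 1))
actual    <- factor(c(1, 0, 0, 1, 0), levels = c(0, 1))

# Base-R cross-tabulation: rows are predictions, columns are true classes.
cm <- table(Prediction = predicted, Reference = actual)
print(cm)

# Accuracy is the share of observations on the diagonal.
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)

# Optional: the same table plus summary statistics from caret.
if (requireNamespace("caret", quietly = TRUE)) {
  result <- caret::confusionMatrix(data = predicted, reference = actual,
                                   positive = "1")
  print(result)
}
```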
