Question

The goal of my project is to predict the accuracy of some text descriptions.

I generated the feature vectors with fastText.

TSV output:

  label lenght
1     0   1:43
2     0   1:10
3     0    1:8
4     0  1:110
5     1  1:105
6     0  1:446

I then process the .tsv file with the e1071 library, using the following script:


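library(e1071)

# Read the tab-separated file; header = FALSE means the first line is treated as data
accuracy.model = read.table(file = 'file.tsv', sep = '\t', header = FALSE, col.names = c("label", "lenght"))

head(accuracy.model)

# RBF-kernel SVM classifier; cross = 10 requests 10-fold cross-validation
classifier = svm(formula = label ~ .,
                 data = accuracy.model,
                 type = 'C-classification',
                 kernel = 'radial',
                 cost = 32,
                 gamma = 8,
                 cross = 10)

classifier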

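After 10-fold cross-validation, I can retrieve the overall accuracy percentage.

I would also like to get the F1 score, precision and recall values.

As for the confusion matrix, I went through some other Stack Overflow threads and found that it has to be done with the caret library, but I don't know how to do it.

Any suggestions?

Thanks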

Answer 1

Suppose we fit a model like this:

library(caret)
library(e1071)

# Turn iris into a two-class problem: versicolor ("v") versus the other species ("o")
data = iris
data$Species = factor(ifelse(data$Species == "versicolor", "v", "o"))

classifier = svm(formula = Species ~ .,
                 data = data,
                 type = 'C-classification',
                 kernel = 'radial',
                 cost = 32,
                 gamma = 8,
                 cross = 10)

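Then we build the confusion matrix from the fitted values (the model's predictions on the training data), with predictions in rows and true labels in columns:

mat = table(classifier$fitted,data$Species)

and apply the caret function:

confusionMatrix(mat)$byClass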

         Sensitivity          Specificity       Pos Pred Value 
           1.0000000            1.0000000            1.0000000 
      Neg Pred Value            Precision               Recall 
           1.0000000            1.0000000            1.0000000 
                  F1           Prevalence       Detection Rate 
           1.0000000            0.6666667            0.6666667 
Detection Prevalence    Balanced Accuracy 
           0.6666667            1.0000000
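
To make the Precision, Recall and F1 entries above less of a black box, here is a minimal base-R sketch that computes them by hand from a prediction-by-truth table. The helper `prf_from_table` and the toy label vectors are hypothetical, made up for illustration; they are not part of caret or e1071.

```r
# Hypothetical helper (not from caret or e1071): precision, recall and F1
# for one class, given a table with predictions in rows and true labels
# in columns.
prf_from_table <- function(mat, positive) {
  tp <- mat[positive, positive]        # predicted positive, truly positive
  fp <- sum(mat[positive, ]) - tp      # predicted positive, truly negative
  fn <- sum(mat[, positive]) - tp      # predicted negative, truly positive
  precision <- tp / (tp + fp)          # NaN if the class is never predicted
  recall    <- tp / (tp + fn)
  f1        <- 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, f1 = f1)
}

# Toy labels, invented for this example only
actual    <- factor(c("v", "v", "v", "o", "o", "o", "o", "o"), levels = c("o", "v"))
predicted <- factor(c("v", "v", "o", "v", "o", "o", "o", "o"), levels = c("o", "v"))

mat_toy <- table(predicted, actual)
print(mat_toy)

metrics_v <- prf_from_table(mat_toy, positive = "v")
print(metrics_v)
```

Note that `confusionMatrix` treats the first factor level as the positive class by default ("o" in the iris example, which matches the prevalence of 0.667). To report the metrics for "v" instead, you can pass `positive = "v"`.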

You can apply the same steps to your own data.
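
One caveat: `classifier$fitted` holds predictions on the training rows, so a table built from it does not reflect the 10-fold cross-validation. With `cross = 10`, the fitted object additionally stores `accuracies` and `tot.accuracy`, but not the per-fold predictions. If you want a cross-validated confusion matrix, one option is to run the folds yourself. The following is only a sketch under stated assumptions: the fold assignment, the seed, and the variable names (`fold`, `cv_pred`, `cv_mat`) are my own choices, not something e1071 prescribes.

```r
library(e1071)

set.seed(42)  # arbitrary seed, only to make the fold assignment reproducible

data <- iris
data$Species <- factor(ifelse(data$Species == "versicolor", "v", "o"))

k <- 10
# Randomly assign each row to one of k folds of (nearly) equal size
fold <- sample(rep(seq_len(k), length.out = nrow(data)))

# Out-of-fold predictions: each row is predicted by a model that never saw it
cv_pred <- factor(rep(NA_character_, nrow(data)), levels = levels(data$Species))
for (i in seq_len(k)) {
  test_idx <- which(fold == i)
  fit <- svm(Species ~ ., data = data[-test_idx, ],
             type = "C-classification", kernel = "radial",
             cost = 32, gamma = 8)
  cv_pred[test_idx] <- predict(fit, newdata = data[test_idx, ])
}

# Rows = predicted, columns = true labels
cv_mat <- table(predicted = cv_pred, actual = data$Species)
print(cv_mat)

# Metrics for the "v" class, computed directly from the table
tp        <- cv_mat["v", "v"]
precision <- tp / sum(cv_mat["v", ])
recall    <- tp / sum(cv_mat[, "v"])
f1        <- 2 * precision * recall / (precision + recall)
print(c(precision = precision, recall = recall, f1 = f1))
```

The same `cv_mat` can also be handed to caret, for example `confusionMatrix(cv_mat, positive = "v", mode = "prec_recall")`, which should report the same precision, recall and F1 alongside the other statistics.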
