Weka - limiting one type of classification error

Time: 2014-01-12 08:15:11

Tags: weka

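I am trying out logistic regression in Weka. Is there a way to tell Weka to minimize one particular type of error? I don't mind more instances being wrongly classified as b, but I want to keep the number of b instances classified as a as low as possible.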

Here is my output:

Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
                                    Class
Variable                              yes
=========================================
cmapArithAvg                      28.9022
cnllArithAvg                       1.8342
cmapGeoAvg                       -92.0111
cnllGeoAvg                        -0.6321
avgCatchAllScorer                       0
cmapMin                       -15333.0622
cmapMinInternal                15210.7515
cnllMin                            0.0267
cmapStdev                         -0.9583
cnllStdev                         -2.0748
numphones                          0.3234
Intercept                         12.3432


Odds Ratios...
                                    Class
Variable                              yes
=========================================
cmapArithAvg         3.564876537642066E12
cnllArithAvg                       6.2601
cmapGeoAvg                              0
cnllGeoAvg                         0.5315
avgCatchAllScorer                       1
cmapMin                                 0
cmapMinInternal                  Infinity
cnllMin                            1.0271
cmapStdev                          0.3835
cnllStdev                          0.1256
numphones                          1.3818


Time taken to build model: 0.67 seconds
Time taken to test model on training data: 0.28 seconds

=== Error on training data ===

Correctly Classified Instances       11383               95.2791 %
Incorrectly Classified Instances       564                4.7209 %
Kappa statistic                          0.7434
Mean absolute error                      0.0723
Root mean squared error                  0.1883
Relative absolute error                 36.4503 %
Root relative squared error             59.8021 %
Total Number of Instances            11947     


=== Confusion Matrix ===

     a     b   <-- classified as
 10442   171 |     a = yes
   393   941 |     b = no



=== Stratified cross-validation ===

Correctly Classified Instances       11376               95.2206 %
Incorrectly Classified Instances       571                4.7794 %
Kappa statistic                          0.7401
Mean absolute error                      0.0726
Root mean squared error                  0.189 
Relative absolute error                 36.5861 %
Root relative squared error             60.0198 %
Total Number of Instances            11947     


=== Confusion Matrix ===

     a     b   <-- classified as
 10439   174 |     a = yes
   397   937 |     b = no

1 answer:

Answer 0 (score: 1)

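You can try cost-sensitive classification. You define a cost matrix that assigns a larger cost to the errors you want to minimize (here, b instances classified as a). Since most classifiers try to minimize the average error, weighting the errors by cost makes them work harder to avoid the expensive ones, at the price of making more of the cheap ones.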
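To make the idea concrete, here is a minimal sketch of the minimum-expected-cost decision rule in plain Java. It is not WEKA code: the class name, the method names, and the 5:1 cost ratio are all made up for illustration, and the matrix is assumed to be laid out like the confusion matrix in the question (rows are the actual class, columns the predicted class).

```java
// Illustrative sketch only, not part of WEKA: choose the prediction
// with the lowest expected cost given the model's class probabilities.
final class MinExpectedCost {

    // Hypothetical cost matrix: rows = actual class, columns = predicted class.
    // Index 0 is "a = yes", index 1 is "b = no".
    // A "no" instance classified as "yes" costs 5; the opposite mistake costs 1.
    static final double[][] COST = {
        {0.0, 1.0},
        {5.0, 0.0}
    };

    /** Expected cost of predicting class `predicted` given the class probabilities. */
    static double expectedCost(double[] classProbs, double[][] cost, int predicted) {
        double sum = 0.0;
        for (int actual = 0; actual < classProbs.length; actual++) {
            sum += classProbs[actual] * cost[actual][predicted];
        }
        return sum;
    }

    /** Index of the cheapest prediction (ties go to the lower index). */
    static int predict(double[] classProbs, double[][] cost) {
        int best = 0;
        for (int c = 1; c < cost[0].length; c++) {
            if (expectedCost(classProbs, cost, c) < expectedCost(classProbs, cost, best)) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The model says P(yes) = 0.7, P(no) = 0.3 for some instance.
        double[] probs = {0.7, 0.3};
        System.out.println("predicted class index: " + predict(probs, COST));
    }
}
```

With these assumed costs, an instance is labelled "yes" only when P(yes) exceeds 5/6, so an instance with P(yes) = 0.7 is labelled "no" even though "yes" is more likely. That is exactly the trade described above: more a instances end up as b, fewer b instances end up as a.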

In WEKA you can do this with the meta-classifier CostSensitiveClassifier. An example using the Weka Explorer is shown in this blog post.
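Once errors have different costs, accuracy is no longer the number to compare between runs; total cost is. As a rough sketch in plain Java (again not WEKA code, with a hypothetical class name and an assumed 5:1 cost ratio), the cross-validation confusion matrix from the question can be scored like this:

```java
// Illustrative sketch only: total misclassification cost of a confusion matrix.
final class ConfusionMatrixCost {

    /** Sums count * cost over every (actual, predicted) cell. */
    static double totalCost(long[][] confusion, double[][] cost) {
        double total = 0.0;
        for (int actual = 0; actual < confusion.length; actual++) {
            for (int predicted = 0; predicted < confusion[actual].length; predicted++) {
                total += confusion[actual][predicted] * cost[actual][predicted];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Cross-validation confusion matrix from the question
        // (rows = actual class, columns = classified as; a = yes, b = no).
        long[][] confusion = {
            {10439, 174},
            {397, 937}
        };
        // Assumed costs: b classified as a costs 5, a classified as b costs 1.
        double[][] cost = {
            {0.0, 1.0},
            {5.0, 0.0}
        };
        System.out.println("total cost: " + totalCost(confusion, cost));
    }
}
```

Under these assumed costs the plain logistic model scores 174 × 1 + 397 × 5 = 2159. A cost-sensitive run is an improvement if it brings that total down, even when its raw accuracy is lower.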