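I am trying to use Weka's logistic regression. Is there a way to tell Weka to minimize a particular type of error? I don't mind more instances being wrongly classified as b, but I want to minimize the number of b instances that are classified as a.
This is my output: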
Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
Class
Variable yes
=========================================
cmapArithAvg 28.9022
cnllArithAvg 1.8342
cmapGeoAvg -92.0111
cnllGeoAvg -0.6321
avgCatchAllScorer 0
cmapMin -15333.0622
cmapMinInternal 15210.7515
cnllMin 0.0267
cmapStdev -0.9583
cnllStdev -2.0748
numphones 0.3234
Intercept 12.3432
Odds Ratios...
Class
Variable yes
=========================================
cmapArithAvg 3.564876537642066E12
cnllArithAvg 6.2601
cmapGeoAvg 0
cnllGeoAvg 0.5315
avgCatchAllScorer 1
cmapMin 0
cmapMinInternal Infinity
cnllMin 1.0271
cmapStdev 0.3835
cnllStdev 0.1256
numphones 1.3818
Time taken to build model: 0.67 seconds
Time taken to test model on training data: 0.28 seconds
=== Error on training data ===
Correctly Classified Instances 11383 95.2791 %
Incorrectly Classified Instances 564 4.7209 %
Kappa statistic 0.7434
Mean absolute error 0.0723
Root mean squared error 0.1883
Relative absolute error 36.4503 %
Root relative squared error 59.8021 %
Total Number of Instances 11947
=== Confusion Matrix ===
a b <-- classified as
10442 171 | a = yes
393 941 | b = no
=== Stratified cross-validation ===
Correctly Classified Instances 11376 95.2206 %
Incorrectly Classified Instances 571 4.7794 %
Kappa statistic 0.7401
Mean absolute error 0.0726
Root mean squared error 0.189
Relative absolute error 36.5861 %
Root relative squared error 60.0198 %
Total Number of Instances 11947
=== Confusion Matrix ===
a b <-- classified as
10439 174 | a = yes
397 937 | b = no
Answer 0 (score: 1)
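You can try cost-sensitive classification. You define a cost matrix that assigns a larger cost to the errors you want to minimize. Since most classifiers try to minimize the average error, weighting those errors more heavily makes the classifier work harder to avoid them.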
You can do this in WEKA with the CostSensitiveClassifier meta-classifier. An example in the Weka Explorer is shown in this blog post.
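To make the effect of a cost matrix concrete, here is a minimal, Weka-free sketch of the minimum-expected-cost decision rule applied to a two-class probability such as the one logistic regression outputs. The class name `CostSensitiveDecision`, the helper methods, and the 5:1 cost ratio are hypothetical and chosen only for illustration; this shows the idea behind cost-sensitive prediction, not Weka's actual implementation. Rows of the cost matrix are the actual class and columns the predicted class, in the same order as the confusion matrix in the question (index 0 = a = yes, index 1 = b = no).

```java
final class CostSensitiveDecision {

    // cost[actual][predicted]; index 0 = "yes" (a), index 1 = "no" (b).
    // Returns the index of the class with the lowest expected cost.
    static int predict(double pYes, double[][] cost) {
        double pNo = 1.0 - pYes;
        double costIfYes = pYes * cost[0][0] + pNo * cost[1][0];
        double costIfNo = pYes * cost[0][1] + pNo * cost[1][1];
        return costIfYes <= costIfNo ? 0 : 1;
    }

    // Total cost of a confusion matrix (rows = actual, columns = predicted).
    static double totalCost(long[][] confusion, double[][] cost) {
        double total = 0.0;
        for (int actual = 0; actual < confusion.length; actual++) {
            for (int predicted = 0; predicted < confusion[actual].length; predicted++) {
                total += confusion[actual][predicted] * cost[actual][predicted];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Uniform costs: every error counts as 1 (plain error counting).
        double[][] uniform = {{0, 1}, {1, 0}};
        // Hypothetical costs: a "no" predicted as "yes" is 5x as costly.
        double[][] skewed = {{0, 1}, {5, 0}};

        double pYes = 0.7; // illustrative model output for one instance
        System.out.println("uniform costs -> class " + predict(pYes, uniform));
        System.out.println("skewed costs  -> class " + predict(pYes, skewed));

        // Cross-validation confusion matrix from the question.
        long[][] cv = {{10439, 174}, {397, 937}};
        System.out.println("errors (uniform cost): " + totalCost(cv, uniform));
        System.out.println("cost under skewed matrix: " + totalCost(cv, skewed));
    }
}
```

With these assumed costs, an instance is labelled "yes" only when P(yes) is at least 5/6, so an instance with P(yes) = 0.7 flips from "yes" to "no". That is the trade-off the question asks for: fewer b instances classified as a, at the price of more a instances classified as b.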