Question

If I use any algorithm in Weka, I get results in the following format:

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         302               63.3124 %
Incorrectly Classified Instances       175               36.6876 %
Kappa statistic                          0.3536
Mean absolute error                      0.3464
Root mean squared error                  0.4176
Relative absolute error                 85.5832 %
Root relative squared error             92.8684 %
Total Number of Instances              477     

=== Detailed Accuracy By Class ===

           TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
             0.801     0.407      0.686     0.801     0.739      0.659    1
             0.748     0.243      0.549     0.748     0.633      0.718    2
             0         0          0         0         0          0.478    3
Weighted Avg.    0.633     0.283      0.516     0.633     0.568      0.641

=== Confusion Matrix ===

     a   b   c   <-- classified as
   201  50   0 |   a = 1
    34 101   0 |   b = 2
    58  33   0 |   c = 3

But if I use k-means, my results come out in this format:

=== Model and evaluation on training set ===


kMeans
======

Number of iterations: 9
Within cluster sum of squared errors: 297.46622082142716
Missing values globally replaced with mean/mode

Cluster centroids:
                            Cluster#
Attribute        Full Data         0         1         2
                     (477)     (136)     (172)     (169)
========================================================
Religion            8.6939    7.6691    8.9709    9.2367
Vote_Criterion      2.7736    2.8971    2.4942    2.9586
Sex                 1.4906    1.4559         2         1
DateBirth        1930.7652 1937.5147 1920.2965 1935.9882
Educ                3.2201    3.2721    3.2209    3.1775
Immigrant           1.6415    1.6838    1.5872    1.6627
Income              2.4675       2.5    2.5523     2.355
Occupation          3.6184    3.8162    3.2907    3.7929
Vote2013                 1         2         1         1




Time taken to build model (full training data) : 0.06 seconds

=== Model and evaluation on training set ===

    Clustered Instances

    0       136 ( 29%)
    1      172 ( 36%)
    2      169 ( 35%)

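...but I want to see the correctly classified instances, precision, recall, and so on, as shown for the other algorithms. Why does this happen, and how can I get Weka to display the k-means results in the first format?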

Answer 1

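K-Means is inherently a clustering algorithm:

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).

So it has no notion of a "class" and is therefore not meant for classification (it can be done, of course, but the performance will probably not be very good). Are you sure you are using it correctly here?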

Also, see here (emphasis mine):

In order to use a clusterer in a supervised setting, you can use the meta-classifier ClassificationViaClustering.
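
To make the idea concrete, here is a minimal sketch of what "classification via clustering" amounts to in general: each cluster is assigned the class that is most common among its training instances, and an instance is then predicted to have the class of its cluster. This is plain Python and not Weka's actual implementation; the function names and the toy data are made up for illustration.

```python
from collections import Counter, defaultdict

def map_clusters_to_classes(cluster_ids, class_labels):
    """Assign each cluster the most common class among its training members."""
    members = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        members[cluster][label] += 1
    return {cluster: counts.most_common(1)[0][0]
            for cluster, counts in members.items()}

def predict(cluster_ids, cluster_to_class):
    """Predict the class of each instance from the cluster it fell into."""
    return [cluster_to_class[c] for c in cluster_ids]

# Hypothetical toy data: cluster assignments from some clusterer,
# and the true class of each instance (not taken from the question).
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
classes = [2, 2, 1, 1, 1, 1, 3, 1]

mapping = map_clusters_to_classes(clusters, classes)
predicted = predict(clusters, mapping)
accuracy = sum(p == t for p, t in zip(predicted, classes)) / len(classes)

print(mapping)   # which class each cluster stands for
print(accuracy)  # fraction of instances whose cluster's class matches their own
```

In this toy data no cluster is dominated by class 3, so class 3 is never predicted. A class can end up with zero precision and recall in the same way when clusters are turned into class predictions.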

Answer 2

In this case the ClassificationViaClustering meta-classifier can be used. In WEKA 3.8 it has to be downloaded separately through the package manager. Hope this helps.
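
For reference, the figures in the first output format are all derived from a confusion matrix, which only exists once every instance has a predicted class. The sketch below recomputes a few of them in plain Python from the confusion matrix in the question. It is not Weka code, and reporting 0 for a class that is never predicted is an assumption chosen to match the output shown there.

```python
# Confusion matrix copied from the question:
# rows are the actual classes, columns are the predicted classes.
confusion = [
    [201, 50, 0],   # actual class 1
    [34, 101, 0],   # actual class 2
    [58, 33, 0],    # actual class 3
]
labels = [1, 2, 3]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(labels)))
accuracy = correct / total

def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when a class is never predicted or never occurs."""
    return numerator / denominator if denominator else 0.0

per_class = {}
for i, label in enumerate(labels):
    actual_count = sum(confusion[i])                    # row sum
    predicted_count = sum(row[i] for row in confusion)  # column sum
    per_class[label] = {
        "precision": safe_div(confusion[i][i], predicted_count),
        "recall": safe_div(confusion[i][i], actual_count),
    }

print(f"Correctly classified: {correct} of {total} ({accuracy:.4%})")
for label, scores in per_class.items():
    print(label, round(scores["precision"], 3), round(scores["recall"], 3))
```

Plain k-means output contains no such matrix, because cluster numbers are not class labels. Something like ClassificationViaClustering has to supply the mapping first.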

Different results for the K-means algorithm in Weka
