Java / WEKA: K-means clustering error: Cannot handle any class attribute

Date: 2014-12-17 16:51:03

Tags: java weka

    SimpleKMeans kmeans = new SimpleKMeans();
    int numberOfClusters = 2;
    int[] assignments = null;

    kmeans.setSeed(10);

    // This is the important parameter to set
    kmeans.setPreserveInstancesOrder(true);
    try {
        kmeans.setNumClusters(numberOfClusters);
        kmeans.buildClusterer(instancesOne); // <-- exception being thrown
        // This array returns the cluster number (starting with 0) for each instance
        // The array has as many elements as the number of instances
        assignments = kmeans.getAssignments();
    } catch (Exception e) {
        e.printStackTrace();
    }

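I am trying to initialize the parameters of the EM algorithm with the k-means algorithm. The idea is to obtain 2 centroids that I can then use as the starting point for training the parameters of a GMM. However, I get the following error: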

weka.core.WekaException: weka.clusterers.SimpleKMeans: Cannot handle any class attribute!
    at weka.core.Capabilities.test(Unknown Source)
    at weka.core.Capabilities.test(Unknown Source)
    at weka.core.Capabilities.testWithFail(Unknown Source)
    at weka.clusterers.SimpleKMeans.buildClusterer(Unknown Source)
    at hmm.HMM.run(HMM.java:62)
    at hmm.HMM.main(HMM.java:22)
Exception in thread "main" java.lang.NullPointerException
    at hmm.HMM.run(HMM.java:71)
    at hmm.HMM.main(HMM.java:22)

Also, how do I set two random centroids? I thought the setSeed() method would do that, but how do I write it so that it uses my dataset? My CSV file looks like this:

[Image: contents of the CSV file]

Then I load it:

Instances instancesOne = loader.loadCsv("train", "class1");

Here is some information about the attributes after loading:

dataset:

@relation class1

@attribute x numeric
@attribute y numeric

@data
-9.0278,3.1518
-9.5656,3.6383
-9.805,3.8284
etc...

Answer: this code was needed to make the Instances class-less (that is, to remove the class attribute):

// remove class attribute, make class-less
Instances dataClusterer = null;
weka.filters.unsupervised.attribute.Remove filter = new weka.filters.unsupervised.attribute.Remove();
filter.setAttributeIndices("" + (instancesOne.classIndex() + 1));
try {
    filter.setInputFormat(instancesOne);
    dataClusterer = weka.filters.Filter.useFilter(instancesOne, filter);
} catch (Exception e1) {
    e1.printStackTrace();
    return;
}
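
Once the data is class-less and buildClusterer() succeeds, the assignments array can be turned into starting parameters for the GMM, which is the goal stated at the top. The following is only a minimal sketch in plain Java under several assumptions: the points are available as a double[][], the assignments are an int[] like the one returned by getAssignments(), and each component uses a diagonal covariance (per-dimension variances). The names GmmInit, Params and fromAssignments are hypothetical and are not part of WEKA.

```java
public class GmmInit {

    /** Hypothetical holder for the initial mixture parameters. */
    public static final class Params {
        public final double[] weights;     // mixing weight of each component
        public final double[][] means;     // means[c][d]
        public final double[][] variances; // variances[c][d], diagonal covariance (assumption)

        Params(double[] weights, double[][] means, double[][] variances) {
            this.weights = weights;
            this.means = means;
            this.variances = variances;
        }
    }

    /** Derives per-cluster weight, mean and variance from hard cluster assignments. */
    public static Params fromAssignments(double[][] points, int[] assignments, int k) {
        if (points.length == 0 || points.length != assignments.length) {
            throw new IllegalArgumentException("points and assignments must be non-empty and equal in length");
        }
        int n = points.length;
        int dim = points[0].length;
        int[] counts = new int[k];
        double[][] means = new double[k][dim];
        double[][] variances = new double[k][dim];

        // First pass: sum the coordinates of the points in each cluster.
        for (int i = 0; i < n; i++) {
            int c = assignments[i];
            counts[c]++;
            for (int d = 0; d < dim; d++) {
                means[c][d] += points[i][d];
            }
        }
        for (int c = 0; c < k; c++) {
            for (int d = 0; d < dim && counts[c] > 0; d++) {
                means[c][d] /= counts[c];
            }
        }

        // Second pass: sum the squared deviations from each cluster mean.
        for (int i = 0; i < n; i++) {
            int c = assignments[i];
            for (int d = 0; d < dim; d++) {
                double diff = points[i][d] - means[c][d];
                variances[c][d] += diff * diff;
            }
        }

        double[] weights = new double[k];
        for (int c = 0; c < k; c++) {
            weights[c] = counts[c] / (double) n;
            for (int d = 0; d < dim && counts[c] > 0; d++) {
                variances[c][d] /= counts[c]; // maximum-likelihood estimate (divide by count)
            }
        }
        return new Params(weights, means, variances);
    }

    public static void main(String[] args) {
        // Toy data, not taken from the question's CSV file.
        double[][] points = {{0, 0}, {2, 2}, {10, 10}, {12, 14}};
        int[] assignments = {0, 0, 1, 1};
        Params p = fromAssignments(points, assignments, 2);
        System.out.println(java.util.Arrays.toString(p.weights));
        System.out.println(java.util.Arrays.deepToString(p.means));
        System.out.println(java.util.Arrays.deepToString(p.variances));
    }
}
```

An empty cluster keeps a zero mean and zero variance in this sketch, so a real EM start would probably want a small variance floor before the first iteration.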

1 answer:

Answer 0: (score: 0)

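I don't believe K-Means clustering needs a class attribute. If you have set one on your instances, try removing it and running the code again. This guide may help with how to build a clustering model.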
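
On the side question about setSeed(): as far as I know, the seed does not specify the centroids themselves. It only seeds the random number generator that SimpleKMeans uses to draw its starting centroids from your data, so the same seed on the same data gives the same starting points. The sketch below illustrates that idea in plain Java. It is not WEKA's implementation, and the names SeededCentroids and pickInitialCentroids are made up for the example.

```java
import java.util.Arrays;
import java.util.Random;

public class SeededCentroids {

    /** Picks k distinct rows of data as starting centroids, using a seeded generator. */
    public static double[][] pickInitialCentroids(double[][] data, int k, long seed) {
        if (k < 1 || k > data.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        Random random = new Random(seed);

        // Partial Fisher-Yates shuffle over the row indices:
        // the first k positions end up holding a random sample without replacement.
        int[] indices = new int[data.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        double[][] centroids = new double[k][];
        for (int i = 0; i < k; i++) {
            int j = i + random.nextInt(indices.length - i);
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
            centroids[i] = data[indices[i]].clone();
        }
        return centroids;
    }

    public static void main(String[] args) {
        // The three rows shown in the question's dataset listing.
        double[][] data = {
            {-9.0278, 3.1518},
            {-9.5656, 3.6383},
            {-9.805, 3.8284}
        };
        double[][] first = pickInitialCentroids(data, 2, 10L);
        double[][] second = pickInitialCentroids(data, 2, 10L);
        System.out.println(Arrays.deepToString(first));
        // Same seed and same data give the same picks, so this prints true.
        System.out.println(Arrays.deepEquals(first, second));
    }
}
```

So to get two random centroids from your own dataset, setNumClusters(2) together with any seed should be enough. Changing the seed only changes which instances are drawn as the starting points.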

Hope this helps!