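Simple K-Means does not handle iris.arff

Asked: 2011-05-13 10:01:07

Tags: java weka

I have the class below, which I built from the examples given in the wiki and in papers. Why can't SimpleKMeans handle the data? The class can print the loaded data (dados), so reading the file is not the problem; the error occurs when building the clusterer.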

package slcct;

import weka.clusterers.ClusterEvaluation;
import weka.clusterers.SimpleKMeans;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;


public class Cluster {

public String path;
public Instances dados;
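public String[] options = new String[4];

public Cluster(String caminho, int nclusters, int seed ){
    this.path = caminho;
    // SimpleKMeans options are flag/value pairs: -N = number of clusters, -S = random seed.
    this.options[0] = "-N";
    this.options[1] = String.valueOf(nclusters);
    this.options[2] = "-S";
    this.options[3] = String.valueOf(seed);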

}

public void ledados() throws Exception{

    DataSource source = new DataSource(path);
    dados = source.getDataSet();
    System.out.println(dados);

    if(dados.classIndex()==-1){
        dados.setClassIndex(dados.numAttributes()-1);
    }
}

public void imprimedados(){
    for(int i=0; i<dados.numInstances();i++)
    {
        Instance actual = dados.instance(i);
        System.out.println((i+1) + " : "+ actual);
    }
}

public void clustering() throws Exception{

    SimpleKMeans cluster = new SimpleKMeans();
    cluster.setOptions(options);
    cluster.setDisplayStdDevs(true);
    cluster.getMaxIterations();
    cluster.buildClusterer(dados);

    Instances ClusterCenter = cluster.getClusterCentroids();
    Instances SDev = cluster.getClusterStandardDevs();
    int[] ClusterSize = cluster.getClusterSizes(); 

    ClusterEvaluation eval = new ClusterEvaluation();
    eval.setClusterer(cluster);
    eval.evaluateClusterer(dados);

    for(int i=0;i<ClusterCenter.numInstances();i++){
        System.out.println("Cluster#"+( i +1)+ ": "+ClusterSize[i]+" dados .");
        System.out.println("Centróide:"+ ClusterCenter.instance(i));
        System.out.println("STDDEV:" + SDev.instance(i));
        System.out.println("Cluster Evaluation:"+eval.clusterResultsToString());

    }

}
}

Error:

weka.core.WekaException: weka.clusterers.SimpleKMeans: Cannot handle any class attribute!

at weka.core.Capabilities.test(Capabilities.java:1097)    
at weka.core.Capabilities.test(Capabilities.java:1018)   
at weka.core.Capabilities.testWithFail(Capabilities.java:1297) 
at weka.clusterers.SimpleKMeans.buildClusterer(SimpleKMeans.java:228)    
at slcct.Cluster.clustering(Cluster.java:53)//Here.    
at slcct.Clustering.jButton1ActionPerformed(Clustering.java:104)

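3 answers:

Answer 0 (score: 7)

I believe you don't need to set the class index, because you are doing clustering, not classification. Try following this guide for programmatic Java clustering.

Answer 1 (score: 3)

In the ledados() function, just remove the code block shown below and it will work, because clustering does not need a class defined on your data.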

if(dados.classIndex()==-1){
    dados.setClassIndex(dados.numAttributes()-1);
}

Your new function:

public void ledados() throws Exception{

    DataSource source = new DataSource(path);
    dados = source.getDataSet();
    System.out.println(dados);
}

Answer 2 (score: 0)

When performing k-means clustering, you don't need a class attribute in your data.
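To illustrate why no class attribute is needed, here is a minimal plain-Java sketch of k-means. It is not Weka's SimpleKMeans implementation; the class name TinyKMeans, the choice to seed the centroids with the first k rows, and the sample points are all assumptions made for this example. The point is only that the algorithm works on numeric feature distances and never reads a label.

```java
import java.util.Arrays;

public class TinyKMeans {

    // Assigns each row to one of k clusters using squared Euclidean distance.
    // Rows contain numeric features only; no class label is read anywhere.
    public static int[] cluster(double[][] rows, int k, int maxIterations) {
        int dims = rows[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = rows[c].clone(); // assumption: seed with the first k rows
        }
        int[] assignment = new int[rows.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: move each row to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < rows.length; i++) {
                int best = 0;
                double bestDist = Double.POSITIVE_INFINITY;
                for (int c = 0; c < k; c++) {
                    double dist = 0.0;
                    for (int j = 0; j < dims; j++) {
                        double diff = rows[i][j] - centroids[c][j];
                        dist += diff * diff;
                    }
                    if (dist < bestDist) { bestDist = dist; best = c; }
                }
                if (assignment[i] != best) { assignment[i] = best; changed = true; }
            }
            if (!changed) break; // converged: no row switched cluster

            // Update step: recompute each centroid as the mean of its rows.
            for (int c = 0; c < k; c++) {
                double[] sum = new double[dims];
                int count = 0;
                for (int i = 0; i < rows.length; i++) {
                    if (assignment[i] != c) continue;
                    count++;
                    for (int j = 0; j < dims; j++) sum[j] += rows[i][j];
                }
                if (count == 0) continue; // leave an empty cluster's centroid unchanged
                for (int j = 0; j < dims; j++) centroids[c][j] = sum[j] / count;
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two obvious groups, near (1, 1) and near (9, 9).
        double[][] rows = { {1.0, 1.0}, {9.0, 9.0}, {1.2, 0.8}, {8.8, 9.1} };
        System.out.println(Arrays.toString(cluster(rows, 2, 10))); // prints [0, 1, 0, 1]
    }
}
```

Because the input is just a matrix of features, marking one column as the class (as setClassIndex does in the question) gives the algorithm nothing to use, which is consistent with Weka rejecting the data in that state.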