Weka: using classification via clustering

Time: 2014-03-16 13:11:58

Tags: cluster-analysis classification weka

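I am new to the WEKA tool. Can I combine classification and clustering? That is, first cluster the data, then classify the clustered instances. What steps do I need to follow to do this?

Thanks in advance.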

1 answer:

Answer 0: (score: 2)

Yes, you can. It is straightforward with the ClassificationViaClustering classifier (see Class ClassificationViaClustering).

Steps in Java pseudocode:
1. Create a SimpleKMeans clusterer

import weka.clusterers.SimpleKMeans;

SimpleKMeans skm = new SimpleKMeans();
skm.setNumClusters(5); // in this example the clusterer uses 5 clusters

2. Read the dataset and set the class index

import java.io.BufferedReader;
import java.io.FileReader;
import weka.core.Instances;

BufferedReader reader = new BufferedReader(new FileReader("[path].arff")); // replace [path] with your path to dataset
Instances data = new Instances(reader);
data.setClassIndex([your class index]); // if the first attribute is your class, then insert 0

3. Create the classifier

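import weka.classifiers.meta.ClassificationViaClustering;

ClassificationViaClustering cvc = new ClassificationViaClustering();
cvc.setClusterer(skm); // let your classifier use the SimpleKMeans clusterer
cvc.buildClassifier(data); // clusters the data, then maps clusters to class values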

Then, when you want to classify a new instance:

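// Copy of the first instance, used here as a stand-in for a new one.
// In Weka 3.7+ Instance is an interface, so use new DenseInstance(data.firstInstance()) instead.
Instance instanceToClassify = new Instance(data.firstInstance());
instanceToClassify.setDataset(data); // the instance to be classified has to have access to the dataset
double predictedClass = cvc.classifyInstance(instanceToClassify); // index of the predicted class value, based on the cluster the instance belongs to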
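
To make the idea behind the steps above more concrete, here is a minimal, Weka-free sketch of "cluster first, then classify by cluster". It is not Weka's implementation: the class name ClusterThenClassify, the plain k-means loop, the deterministic choice of the first k points as starting centroids, and the majority-vote mapping from cluster to class are all assumptions made for illustration, and Weka's own cluster-to-class mapping may differ.

```java
class ClusterThenClassify {
    private final double[][] centroids;  // one centroid per cluster
    private final int[] clusterToClass;  // class label assigned to each cluster

    // Cluster the points with plain k-means (labels are ignored while clustering),
    // then label each cluster with the most frequent class among its members.
    ClusterThenClassify(double[][] points, int[] labels, int k, int numClasses, int iterations) {
        int dim = points[0].length;
        centroids = new double[k][];
        // Assumption for reproducibility: the first k points are the starting centroids.
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearestCluster(p);
                counts[c]++;
                for (int d = 0; d < dim; d++) {
                    sums[c][d] += p[d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // keep the old centroid if a cluster is empty
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        // Count class labels per cluster and keep the majority class.
        int[][] votes = new int[k][numClasses];
        for (int i = 0; i < points.length; i++) {
            votes[nearestCluster(points[i])][labels[i]]++;
        }
        clusterToClass = new int[k];
        for (int c = 0; c < k; c++) {
            int best = 0;
            for (int j = 1; j < numClasses; j++) {
                if (votes[c][j] > votes[c][best]) {
                    best = j;
                }
            }
            clusterToClass[c] = best;
        }
    }

    // Index of the centroid closest to p (squared Euclidean distance).
    int nearestCluster(double[] p) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // A new point gets the class of the cluster it falls into.
    int classify(double[] p) {
        return clusterToClass[nearestCluster(p)];
    }

    public static void main(String[] args) {
        double[][] points = {{1.0, 1.0}, {9.0, 9.0}, {1.2, 0.8}, {0.9, 1.1}, {8.8, 9.1}, {9.2, 8.9}};
        int[] labels = {0, 1, 0, 0, 1, 1};
        ClusterThenClassify model = new ClusterThenClassify(points, labels, 2, 2, 10);
        System.out.println(model.classify(new double[]{1.1, 0.9})); // prints 0
        System.out.println(model.classify(new double[]{9.0, 9.2})); // prints 1
    }
}
```

The class labels play no part in forming the clusters; they are only used afterwards to name each cluster. That is why the quality of the result depends on how well the clusters happen to line up with the classes, and on choosing a sensible number of clusters.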