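I am new to the WEKA tool. Can I combine classification and clustering, i.e. first cluster the data and then classify the clustered instances? What steps do I need to follow to do this?
Thanks in advance.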
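Answer 0: (score: 2)
Yes, you can. It is straightforward with the ClassificationViaClustering classifier (class weka.classifiers.meta.ClassificationViaClustering).
The steps, as Java snippets:
1. Create a SimpleKMeans clusterer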
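// requires: import weka.clusterers.SimpleKMeans;
SimpleKMeans skm = new SimpleKMeans();
skm.setNumClusters(5); // 5 clusters in this example; for a useful model, use as many clusters as your data has class labels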
2. Read the dataset and set the class index
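// requires: import java.io.BufferedReader; import java.io.FileReader; import weka.core.Instances;
BufferedReader reader = new BufferedReader(new FileReader("[path].arff")); // replace [path] with the path to your dataset
Instances data = new Instances(reader);
reader.close(); // the ARFF file has been fully read into memory at this point
data.setClassIndex([your class index]); // e.g. 0 if the first attribute is your class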
3. Create the classifier
// requires: import weka.classifiers.meta.ClassificationViaClustering;
ClassificationViaClustering cvc = new ClassificationViaClustering();
cvc.setClusterer(skm); // let your classifier use the SimpleKMeans clusterer
cvc.buildClassifier(data);
Then, when you want to classify a new instance:
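// In WEKA 3.7 and later, Instance is an interface, so the copy is made with weka.core.DenseInstance;
// in WEKA 3.6 and earlier, write new Instance(data.firstInstance()) instead.
Instance instanceToClassify = new DenseInstance(data.firstInstance());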
instanceToClassify.setDataset(data); // the instance to be classified has to have access to the dataset
double predictedClass = cvc.classifyInstance(instanceToClassify); // "class" is a reserved word in Java; the result is the index of the class assigned to the instance's cluster
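To make "classify an instance by the cluster it belongs to" more concrete, here is a minimal plain-Java sketch of the idea that does not use WEKA at all. It assumes the clustering step has already produced centroids, labels each cluster with the most frequent class among the training points that fall into it, and classifies a new point by its nearest centroid. The class name ClusterToClassSketch, its methods, and the toy data are made up for illustration; WEKA's own cluster-to-class mapping may differ in detail, so treat this as an explanation of the concept rather than a description of the library's internals.

```java
import java.util.Arrays;

class ClusterToClassSketch {

    /** Index of the centroid closest to the point (squared Euclidean distance). */
    static int nearestCentroid(double[][] centroids, double[] point) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = centroids[c][d] - point[d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /**
     * For each cluster, the most frequent class label among the training
     * points assigned to it, or -1 if no training point falls into it.
     */
    static int[] majorityLabelPerCluster(double[][] centroids, double[][] points,
                                         int[] labels, int numClasses) {
        int[][] counts = new int[centroids.length][numClasses];
        for (int i = 0; i < points.length; i++) {
            counts[nearestCentroid(centroids, points[i])][labels[i]]++;
        }
        int[] clusterLabel = new int[centroids.length];
        for (int c = 0; c < centroids.length; c++) {
            clusterLabel[c] = -1;
            int bestCount = 0;
            for (int k = 0; k < numClasses; k++) {
                if (counts[c][k] > bestCount) {
                    bestCount = counts[c][k];
                    clusterLabel[c] = k;
                }
            }
        }
        return clusterLabel;
    }

    /** Classify a point with the label of the cluster it falls into. */
    static int classify(double[][] centroids, int[] clusterLabel, double[] point) {
        return clusterLabel[nearestCentroid(centroids, point)];
    }

    public static void main(String[] args) {
        // Hypothetical centroids, standing in for the output of a k-means run.
        double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
        double[][] train = {{0.5, 0.2}, {1.0, -0.3}, {9.5, 10.1}, {10.2, 9.7}, {0.1, 0.9}};
        int[] labels = {0, 0, 1, 1, 1};

        int[] clusterLabel = majorityLabelPerCluster(centroids, train, labels, 2);
        System.out.println("Cluster labels: " + Arrays.toString(clusterLabel));
        System.out.println("Predicted class: "
                + classify(centroids, clusterLabel, new double[]{8.0, 9.0}));
    }
}
```

Note that the last training point carries label 1 but sits in the cluster dominated by label 0, so it would be misclassified. This is the usual trade-off of the approach: it only works well when the clusters line up with the classes.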