Question

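I'm writing code that takes traffic data, stores it in an OD matrix, and displays it as a heat map. I'm trying to cluster it (k-means for now), but for some reason my instances' assignments are all zeros. Here is my code: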

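import java.util.ArrayList;
import java.util.Vector;

import weka.clusterers.SimpleKMeans;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

public class Clustering {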

Vector <ODData> myData = new Vector <ODData>();
int capacity;
int ROUTES_SIZE = 324263;

public Clustering( ODData [] routes, int cap )
{
    capacity = cap;
    for (ODData s : routes) 
        {
            myData.add(s);
        }       
}

public void Build_cluster()
{

    Attribute x1 = new Attribute("Beginning point x"); 
    Attribute y1 = new Attribute("Beginning point y");
    Attribute x2 = new Attribute("End point x"); 
    Attribute y2 = new Attribute("End point y");
    Attribute dem = new Attribute("Demand");

    ArrayList <Attribute> attribute_list = new ArrayList <Attribute>(5); 
    attribute_list.add(x1); 
    attribute_list.add(y1); 
    attribute_list.add(x2);
    attribute_list.add(y2); 
    attribute_list.add(dem); 

    Instances attribute_instance = new Instances ("Cluster", attribute_list, capacity);

    double [] temp_array = new double[5];

    for (int i = 0; i < myData.size(); i++)
    {
        ODData s;
        s = myData.get(i);
        temp_array[0] = s.getOrigin().getLattitude();
        temp_array[1] = s.getOrigin().getLongititude();
        temp_array[2] = s.getDestination().getLattitude();
        temp_array[3] = s.getDestination().getLongititude();
        temp_array[4] = s.getValue();

        Instance inst = new DenseInstance( 1, temp_array );
        attribute_instance.add(inst);
    }

    SimpleKMeans Kmeans_clustering = new SimpleKMeans();
    Kmeans_clustering.setPreserveInstancesOrder(true);

    try {
        Kmeans_clustering.buildClusterer(attribute_instance);
    } catch (Exception e1) {
        // TODO Auto-generated catch block
        e1.printStackTrace();
    }


    /*Sorting by groups*/


    int[] assignments = new int[ROUTES_SIZE];
    try {
        assignments = Kmeans_clustering.getAssignments();
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

}

}

Any idea why I'm only getting zeros?

Answer 1

For k-means, you need to specify the number of clusters.

My guess is that it defaults to 1 cluster (with ID 0).
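To illustrate why a single cluster produces only zeros, here is a minimal, self-contained sketch of Lloyd's k-means in plain Java. It is not Weka's implementation; the class name `KMeansSketch`, the first-k-rows seeding, and the toy data are assumptions made for illustration. In Weka itself, the cluster count is set on the clusterer with `setNumClusters(int)` before `buildClusterer` is called.

```java
import java.util.Arrays;

/** Minimal Lloyd's k-means on plain double[] rows; a sketch, not Weka's SimpleKMeans. */
final class KMeansSketch {

    /** Returns the cluster index (0..k-1) assigned to each row. */
    static int[] cluster(double[][] rows, int k, int iterations) {
        int dims = rows[0].length;
        // Assumption: seed the centroids with the first k rows (real libraries seed randomly).
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = rows[c].clone();
        }
        int[] assignments = new int[rows.length];
        for (int it = 0; it < iterations; it++) {
            // Assignment step: pick the nearest centroid by squared Euclidean distance.
            for (int i = 0; i < rows.length; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double d = 0;
                    for (int j = 0; j < dims; j++) {
                        double diff = rows[i][j] - centroids[c][j];
                        d += diff * diff;
                    }
                    if (d < bestDist) {
                        bestDist = d;
                        best = c;
                    }
                }
                assignments[i] = best;
            }
            // Update step: move each centroid to the mean of its members.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < rows.length; i++) {
                counts[assignments[i]]++;
                for (int j = 0; j < dims; j++) {
                    sums[assignments[i]][j] += rows[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // leave the centroid of an empty cluster in place
                for (int j = 0; j < dims; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        double[][] rows = { {0, 0}, {10, 10}, {0, 1}, {10, 11} };
        System.out.println(Arrays.toString(cluster(rows, 1, 10))); // prints [0, 0, 0, 0]
        System.out.println(Arrays.toString(cluster(rows, 2, 10))); // prints [0, 1, 0, 1]
    }
}
```

With `k = 1` every row can only land in cluster 0, so an all-zero assignment array is exactly what one cluster looks like; with `k = 2` the two groups of points get different labels.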

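P.S. For your data (latitude, longitude, demand), k-means doesn't make much sense. You need to define a distance that measures similarity the way your data requires (this is data-specific!), and then use a distance-based clustering algorithm.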
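As one possible reading of "define a distance", the sketch below compares two origin-destination routes by adding the great-circle (haversine) distance between their origins to the one between their destinations. Everything here is an assumption for illustration: the class name `RouteDistance`, the array layout `{originLat, originLon, destLat, destLon}`, the mean Earth radius of 6371 km, and the choice to leave demand out of the distance entirely.

```java
/** Sketch of a data-specific distance between two origin-destination routes. */
final class RouteDistance {

    // Assumption: mean Earth radius in kilometres.
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in km between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding error before taking asin.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Distance between two routes, each laid out as
     * {originLat, originLon, destLat, destLon} (hypothetical layout).
     * Demand is deliberately not part of this distance.
     */
    static double routeDistanceKm(double[] r1, double[] r2) {
        double originGap = haversineKm(r1[0], r1[1], r2[0], r2[1]);
        double destinationGap = haversineKm(r1[2], r1[3], r2[2], r2[3]);
        return originGap + destinationGap;
    }

    public static void main(String[] args) {
        double[] a = { 0.0, 0.0, 5.0, 5.0 };
        double[] b = { 1.0, 0.0, 5.0, 5.0 };
        // The origins differ by one degree of latitude and the destinations coincide.
        System.out.println(routeDistanceKm(a, b));
    }
}
```

A function like this could be plugged into a clustering algorithm that accepts an arbitrary pairwise distance (hierarchical clustering or DBSCAN, for example). Whether demand should be folded in, and how it should be weighted against kilometres, depends on what "similar routes" means for the application.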

Getting only zeros as assignments in SimpleKMeans clustering
