Clustering data with the UPGMA algorithm

Time: 2018-04-20 13:07:14

Tags: c# algorithm matrix

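I am trying to implement the UPGMA algorithm to cluster data. The UPGMA algorithm constructs a rooted tree (dendrogram) from a distance matrix like this one: Distance Matrix
I followed this example: upgma algorithm example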

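UPGMA algorithm:
1- Find the smallest value in the matrix (for example: 17)
2- Group the corresponding pair of elements (a, b)
3- Update the distance matrix based on the new group (a, b), like this: First Distance matrix Update

The generated new matrix:

4- Repeat from step 1.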
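Step 3 is usually stated as a size-weighted average: the distance from the merged group to any other group X is the average of the two old distances, weighted by how many original elements each merged group contains. The following is a minimal sketch of that rule only. The class and method names are hypothetical, and the numbers in the usage note are illustrative values, not taken from the matrices in the images above.

```csharp
using System;

public static class UpgmaUpdate
{
    // Distance from the merged cluster (A + B) to another cluster X.
    // sizeA and sizeB count the original elements inside A and B,
    // so a cluster that already holds two elements weighs twice as much.
    public static double MergedDistance(double distAX, int sizeA, double distBX, int sizeB)
    {
        return (sizeA * distAX + sizeB * distBX) / (sizeA + sizeB);
    }
}
```

For two single elements with distances 21 and 30 to X, `UpgmaUpdate.MergedDistance(21, 1, 30, 1)` gives 25.5. Merging that two-element group with a single element at distance 39 from X, `UpgmaUpdate.MergedDistance(25.5, 2, 39, 1)`, gives 30.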

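Here is my code. I am stuck at step 3, updating the matrix. How can I label index 0, or any other index, with the group it stands for? In other words, how can I know that index 4 represents the group ((b),c),e), for example?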

        Dictionary<int, List<double>> groups = new Dictionary<int, List<double>>();
        int nofgroups = 4;
        while (nofgroups > 0)
        {
            int smallest = smallestnumber();
            Point positionofsmallestnumber = smallestnumposition(smallest);

            groups.Add(nofgroups, new List<double>());
            groups[nofgroups].Add(positionofsmallestnumber.X);
            groups[nofgroups].Add(positionofsmallestnumber.Y);
            groups[nofgroups].Add(smallest);
            // distbcurrentandnewclusterp is not defined anywhere in this excerpt
            groups[nofgroups].Add(distbcurrentandnewclusterp);

            //update
            // GetLength(0) rather than GetUpperBound(0), so the last row is visited too
            for (int row = 0; row < distanceMatrix.GetLength(0); row++)
            {
                for (int col = 0; col < row; col++)
                {
                    // this line is not correct: it overwrites every cell,
                    // not only the distances to the newly merged group
                    distanceMatrix[col, row] = (distanceMatrix[col, positionofsmallestnumber.X] + distanceMatrix[col, positionofsmallestnumber.Y]) / 2;
                }
            }

            //reduce the matrix
            var newdistmatrix = reducearray(positionofsmallestnumber.X, positionofsmallestnumber.X, distanceMatrix);
            distanceMatrix = newdistmatrix;

            nofgroups--;
        }


    //1-get smallest element in distance matrix
    public int smallestnumber()
    {
        int smallest = int.MaxValue;
        //scan the upper triangle of the distance matrix for the smallest distance
        //GetLength(0) rather than GetUpperBound(0), so the last row is visited too
        for (int row = 0; row < distanceMatrix.GetLength(0); row++)
        {
            for (int col = 0; col < row; col++)
            {
                if (smallest > distanceMatrix[col, row])
                {
                    smallest = distanceMatrix[col, row];
                }
            }
        }

        return smallest;
    }
    //2-get the position of smallest element in distance matrix
    Point positionofsmallest;
    public Point smallestnumposition(double smallestnum)
    {
        for (int row = 0; row < distanceMatrix.GetLength(0); row++)
        {
            for (int col = 0; col < row; col++)
            {
                if (distanceMatrix[col, row] == smallestnum)
                {
                    //return at once: a bare break would only leave the inner loop
                    positionofsmallest = new Point(col, row);
                    return positionofsmallest;
                }
            }
        }
        return positionofsmallest;
    }

    //helper used to shrink the matrix by removing one row and one column
    public static int[,] reducearray(int rowToRemove, int columnToRemove, int[,] originalArray)
    {
        int[,] result = new int[originalArray.GetLength(0) - 1, originalArray.GetLength(1) - 1];

        for (int i = 0, j = 0; i < originalArray.GetLength(0); i++)
        {
            if (i == rowToRemove)
                continue;

            for (int k = 0, u = 0; k < originalArray.GetLength(1); k++)
            {
                if (k == columnToRemove)
                    continue;

                result[j, u] = originalArray[i, k];
                u++;
            }
            j++;
        }

        return result;
    }
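Steps 1 and 2 can also be done in one pass, which avoids searching the matrix a second time for a value that was just found. This is a sketch under assumptions: it uses a `double[,]` matrix instead of the `int[,]` above, the names `MatrixSearch` and `FindSmallest` are hypothetical, and the matrix is assumed to be symmetric with at least two rows.

```csharp
using System;

public static class MatrixSearch
{
    // Returns the smallest off-diagonal entry of a symmetric matrix together
    // with its position (Row < Col), scanning the upper triangle once.
    public static (double Value, int Row, int Col) FindSmallest(double[,] d)
    {
        int n = d.GetLength(0);
        var best = (Value: double.PositiveInfinity, Row: -1, Col: -1);
        for (int i = 0; i < n; i++)
        {
            for (int j = i + 1; j < n; j++)
            {
                // strict comparison keeps the first pair found when values tie
                if (d[i, j] < best.Value)
                    best = (d[i, j], i, j);
            }
        }
        return best;
    }
}
```

Returning the value and both indices together also removes the need for a shared field such as `positionofsmallest`.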

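What is the best data structure for reducing the dimensions of the matrix (updating it) while keeping the index of each element correct?
Or, how could I implement this algorithm in another programming language and then call that code from my C# code? Any help is appreciated.
Thanks in advance.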
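One possible approach to the index question, offered as a sketch and not as a definitive implementation: keep the matrix at its original size, and keep parallel lists that record, for each index, the label of the group it currently stands for and how many elements it holds. A merged group reuses one of the two indices and the other is retired, so no index ever shifts. All names here (`Upgma`, `Cluster`, `labels`, `sizes`, `active`) are hypothetical, and the input is assumed to be a symmetric `double[,]` with one name per row.

```csharp
using System;
using System.Collections.Generic;

public static class Upgma
{
    // Runs UPGMA on a symmetric distance matrix and returns the tree as a
    // nested-parentheses string such as "((a,b),c)".
    public static string Cluster(double[,] input, IList<string> names)
    {
        int n = names.Count;
        var d = (double[,])input.Clone();      // work on a copy of the matrix
        var labels = new List<string>(names);  // labels[i] = group stored at index i
        var sizes = new List<int>();           // sizes[i] = elements in that group
        var active = new List<int>();          // indices that still hold a group
        for (int i = 0; i < n; i++) { sizes.Add(1); active.Add(i); }

        while (active.Count > 1)
        {
            // Steps 1 and 2: find the closest pair among the active groups.
            int a = -1, b = -1;
            double best = double.PositiveInfinity;
            foreach (int i in active)
                foreach (int j in active)
                    if (i < j && d[i, j] < best) { best = d[i, j]; a = i; b = j; }

            // Step 3: the merged group reuses index a; index b is retired.
            foreach (int k in active)
            {
                if (k == a || k == b) continue;
                double merged = (sizes[a] * d[a, k] + sizes[b] * d[b, k]) / (sizes[a] + sizes[b]);
                d[a, k] = merged;
                d[k, a] = merged;
            }
            labels[a] = "(" + labels[a] + "," + labels[b] + ")";
            sizes[a] += sizes[b];
            active.Remove(b);                  // step 4: repeat with one group fewer
        }
        return labels[active[0]];
    }
}
```

With an illustrative five-element matrix (values chosen for the example, not taken from the images in the question) whose rows for a, b, c, d, e are `{0,17,21,31,23}`, `{17,0,30,34,21}`, `{21,30,0,28,39}`, `{31,34,28,0,43}`, `{23,21,39,43,0}`, the call `Upgma.Cluster(matrix, new[] { "a", "b", "c", "d", "e" })` returns `(((a,b),e),(c,d))`. At any point during the run, `labels[i]` answers the question of which group index `i` represents.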

0 answers:

No answers