Variable scope in Python

Date: 2014-07-03 21:44:45

Tags: python variables cluster-analysis k-means scoping

I am currently writing a simple Python program to perform k-medians clustering, but I have run into a problem that seems to be related to variable scope.

Here is my clustering code:

import numpy as np

class Cluster(object):
    center = None
    points = []

    def __init__(self, center):
        super(Cluster, self).__init__()
        self.center = center


def manhattan(row_a, row_b):
    dimensions = len(row_a)
    manhattan_dist = 0

    for i in range(0, dimensions):
        manhattan_dist = manhattan_dist + np.abs(float(row_a[i]) - float(row_b[i]))

    return manhattan_dist

def cluster(dataset, cluster_centers):
    clusters = []
    for cluster_center in cluster_centers:
        clusters.append(Cluster(center = cluster_center))

    for point in dataset:
        last_dist = np.inf
        last_cluster = None

        for cluster in clusters:
            dist = manhattan(point, cluster.center)
            if(dist != 0):
                if (dist < last_dist):
                    print(str(dist) + " " + str(last_dist))
                    last_dist = dist
                    last_cluster = cluster


        last_cluster.points.append(point)


    return clusters

result = cluster([[1,1], [1,2], [1,3], [7,2], [8,3], [7,1]], [[2,2], [6,6]])

This is the output I get:

[Image: screenshot of the program output]

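The problem is that I assign to the variable "last_dist" (and possibly "last_cluster") inside the for loop over the clusters, but judging from the output, the values do not seem to be updated at all, except for a single iteration where the value is 7 before it goes back to its original value "inf" again. What is the reason for this, and what can I do about it? Thanks.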

2 answers:

Answer 0 (score: 0)

What else did you expect to happen? This is your code:

for point in dataset:
    last_dist = np.inf # this line is executed 6 times
    last_cluster = None 

    for cluster in clusters:
        ...

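There are only 2 items in clusters and only 6 items in dataset. So for each point (6 times), last_dist starts out as inf. There are 6 occurrences of inf in the output, so it works as expected. For the second cluster, last_dist is printed only when your condition if (dist < last_dist) holds. It looks like that happened only once, which is why you get 7.0 instead of inf. Maybe you have a bug in manhattan()?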
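To make the reasoning above concrete, here is a minimal sketch that replays the nested loop from the question and collects the lines it would print. The helper name trace_assignments is hypothetical, and math.inf and the built-in abs stand in for np.inf and np.abs only to keep the sketch free of dependencies.

```python
import math

def manhattan(row_a, row_b):
    # Sum of absolute coordinate differences, the same metric as in the question.
    return sum(abs(float(a) - float(b)) for a, b in zip(row_a, row_b))

def trace_assignments(dataset, centers):
    """Replay the question's nested loop and collect the lines it would print."""
    lines = []
    for point in dataset:
        last_dist = math.inf  # reset once per point, so 6 resets for 6 points
        for center in centers:
            dist = manhattan(point, center)
            if dist != 0 and dist < last_dist:
                # Recorded before the update, so last_dist is still the old value.
                lines.append("{} {}".format(dist, last_dist))
                last_dist = dist
    return lines

dataset = [[1, 1], [1, 2], [1, 3], [7, 2], [8, 3], [7, 1]]
centers = [[2, 2], [6, 6]]
trace = trace_assignments(dataset, centers)
for line in trace:
    print(line)
```

With this data, only the point [8, 3] is closer to the second center (distance 5) than to the first (distance 7), which accounts for the single line that does not end in inf. This suggests manhattan() itself is behaving correctly here.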

Answer 1 (score: 0)

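There does not seem to be anything wrong with your code. You are trying to find the nearest cluster for each point. You are probably confused because you print those inf values in last_dist before changing it to the value on the left...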
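One way to see that the variables really are updated is to print after the assignment rather than before it. The following is a rough sketch, not the asker's original code: the names assign_points, best_dist and best_cluster are hypothetical, and math.inf replaces np.inf to avoid the NumPy dependency. It also assumes the points list is created per instance in __init__, because a list defined in the class body (points = [] as in the question) is a single object shared by every Cluster instance.

```python
import math

class Cluster(object):
    def __init__(self, center):
        self.center = center
        self.points = []  # per-instance list; a class-level list would be shared

def manhattan(row_a, row_b):
    # Sum of absolute coordinate differences.
    return sum(abs(float(a) - float(b)) for a, b in zip(row_a, row_b))

def assign_points(dataset, centers):
    clusters = [Cluster(center) for center in centers]
    for point in dataset:
        best_dist = math.inf
        best_cluster = None
        for candidate in clusters:
            dist = manhattan(point, candidate.center)
            if dist != 0 and dist < best_dist:
                best_dist = dist
                best_cluster = candidate
                # Printed after the assignment, so the updated value is visible.
                print("point {} -> center {} at distance {}".format(
                    point, candidate.center, best_dist))
        if best_cluster is not None:  # guard against a point with no match
            best_cluster.points.append(point)
    return clusters

result = assign_points(
    [[1, 1], [1, 2], [1, 3], [7, 2], [8, 3], [7, 1]],
    [[2, 2], [6, 6]],
)
```

Because the comparison is strict, a point that is equally far from both centers (such as [7, 2] and [7, 1] here) stays with the first center it was compared against.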