Updating and appending in a Python loop

Time: 2013-10-22 17:58:27

Tags: arrays loops python-2.7 dictionary numpy

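I am new to Python, and I am trying to develop code that performs K-Means clustering using a predefined package called Pycluster. At first I clustered with a fixed number of clusters (n = 10), and the code ran fine. I then tried to extend the code so that, instead of producing only 10 clusters, a loop increases the requested number of clusters from 2 to 10 (or more). That is where the problems started because, as I said, I am completely new to Python. The code I developed is shown below. I realized that the errors start around lines 33 to 49 of the code. I would greatly appreciate any help in getting the code to run.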

# -*- coding: utf-8 -*-
"""
Created on Mon Oct 21 13:53:40 2013

@author: Engin
"""


from Pycluster import *
import numpy as np


#Open the text file containing the stored smart meter data
d=np.loadtxt("120-RES-195-Normalized.txt", delimiter="\t", skiprows=1, usecols=range(1,49))


handle=open("120-RES-195-Normalized.txt")  
record = read(handle) #Store the smart meter data in an array called record.

cluster_results = np.ones((120, 11))
cluster_centroids=np.array([])
within_cluster_sum_of_squares=np.ones((1,11))
between_cluster_sum_of_squares=np.ones((1,11))
distance=[]

for n in range (1,11):
    cluster_results[:,n-1], within_cluster_sum_of_squares[:,n-1], optimal_solution_repetition = record.kcluster(nclusters=n, npass=10, method='a', dist='e')     #Performs the K-Means clustering using the defined parameters
    centroids, cmask = record.clustercentroids(cluster_results[:,n-1], method='a', transpose=0) #Calculates the cluster centroids
    cluster_centroids=np.append(cluster_centroids,centroids)

#The following routine stores the cluster numbers and the indices of the elements belonging to each
#cluster so that the Between Clusters Sum of Squares would be easily calculated. The results will also
#be easily visualised.
    from collections import defaultdict
    cluster_numbers_members = defaultdict(list)
    for i,item in enumerate(cluster_results[:,n-1]):
        cluster_numbers_members[item].append(i)
    cluster_numbers_members = {k:v for k,v in cluster_numbers_members.items() if len(v)>=1}
    cluster_members=cluster_numbers_members.values()
    cluster_numbers=cluster_numbers_members.keys()

    distance[:,n-1]=0
    between_cluster_sum_of_squares[:,n-1]=0
    for i in range(0,n):
        for k in range(0,n):
            distance[:,n-1] = record.clusterdistance(index1=cluster_members[i], index2=cluster_members[k], method='a', dist='e', transpose=0)
            between_cluster_sum_of_squares[:,n-1]=between_cluster_sum_of_squares[:,n-1]+distance[:,n-1]

    WCBCR = within_cluster_sum_of_squares/between_cluster_sum_of_squares
    print cluster_results[:,n-1]
    print within_cluster_sum_of_squares[:,n-1]

print cluster_centroids

#Arranging cluster centroids in (1X48) vector form
cluster_tuple=zip(*[iter(cluster_centroids)]*48)
cluster_array=numpy.array(list(cluster_tuple))

1 answer:

Answer 0 (score: 0)

Replace

[:,n-1]

with

[:n-1]  or [:(n-1)]  # same thing, use whatever you find easier to read
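
To make the difference between the two index forms concrete, here is a minimal sketch in plain Python. It assumes that the failure comes from `distance`, which the question creates as an empty list (`distance=[]`) and then indexes with `[:,n-1]`. The names `values`, `first_items` and `totals` are hypothetical and used only for illustration. Note that on a two-dimensional NumPy array `[:, n-1]` selects column n-1, whereas `[:n-1]` selects the first n-1 rows, so the two forms are not interchangeable there.

```python
# `distance` mirrors the empty list created in the question.
distance = []

# A plain list does not accept a (slice, int) tuple as an index,
# so this assignment raises TypeError.
try:
    distance[:, 0] = 0
    tuple_index_error = None
except TypeError as exc:
    tuple_index_error = exc

# [:n-1] is an ordinary slice: it selects the first n-1 items.
values = [10, 20, 30, 40]  # hypothetical sample data
n = 3
first_items = values[:n - 1]  # identical to values[:(n - 1)]

# One way to keep one result per iteration is to append to a list
# instead of assigning into an index that does not exist yet.
totals = []  # hypothetical name
for n in range(1, 4):
    totals.append(n * n)  # stand-in for a per-n result
```

If the per-iteration results need to live in a fixed-size array instead, preallocating it the way the question does for `between_cluster_sum_of_squares` (with `np.ones((1,11))`) would allow the `[:,n-1]` form to keep working for that variable.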