A kind of k-means clustering

Date: 2015-04-21 10:41:53

Tags: python module scalar

I am trying to run this code in Python 2.7 on a 20 × 20 matrix, and I expect to get two clusters, as with the k-means algorithm.

import numpy as np

filename = np.genfromtxt('Matrix.txt')
M = np.sort (np.random.choice (2,20)) 

##m = np.copy(M) => I get an error there: 'module' object is not callable
M= m  #this option works better, but I am not sure that it is appropriate

#initialization of the clusters
C = {}

for t in xrange(tmax=100):
	#determination of clusters
	J = np.mean(filename[:,M], axis = 1)
	for k in range (2):
		C[k] = np.where (J==k, 0,0) # np.where (J==k) => another error for 'np.where': it takes exactly three arguments but one was given. I saw that it could take only one argument

	#update  
	for k in range (2):
		J = np.mean(filename[np.ix_(C[k],C[k])], axis = 1)
		j= np.argmin(J)
		m[k] = C[k][j] #[j]  => another error for '[j]': invalid index to scalar variable

    #results
print M, C

	




My result:

  

{0:0,1:0}

Expected result:

  

{0:8,1:12}

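In this example, that means cluster '0' contains 8 elements and cluster '1' contains 12. The problem may be caused by the 'np.where' function, but I am not sure.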
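For reference, a minimal sketch of how the two forms of np.where behave in current NumPy releases. The array J below is a hypothetical cluster assignment made up for illustration, not the data from Matrix.txt. It suggests why np.where(J==k, 0, 0) can only ever produce zeros: the three-argument form chooses between the second and third arguments, and here both are 0.

```python
import numpy as np

# Hypothetical cluster assignment for six items (not the data from Matrix.txt).
J = np.array([0, 1, 1, 0, 1, 0])
k = 1

# Three-argument form: take 0 where the condition holds and 0 elsewhere,
# so the result is all zeros no matter what J contains.
always_zero = np.where(J == k, 0, 0)

# One-argument form: returns a tuple of index arrays; [0] holds the
# positions at which the condition is true.
members = np.where(J == k)[0]

print(always_zero)   # [0 0 0 0 0 0]
print(members)       # [1 2 4]
print(len(members))  # 3
```

With the one-argument form, len(members) gives the number of elements assigned to cluster k, which is the kind of count shown in the expected result.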

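I obtained this result by running the program with the workarounds noted in the comments, so none of the errors mentioned above occur, but it still does not work correctly.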

Thank you for your help.

1 answer:

Answer 0 (score: 0)

Another variant (it uses the scikit-learn library):

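import numpy as np
from sklearn import cluster

# Load the matrix as in the question; each row is treated as one sample
filename = np.genfromtxt('Matrix.txt')

n_clusters = 2

# Fit k-means with two clusters
k_means = cluster.KMeans(n_clusters=n_clusters)
k_means.fit(filename)
values = k_means.cluster_centers_  # coordinates of the cluster centers
labels = k_means.labels_           # cluster index assigned to each row

print(values)
print(labels)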
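
To turn a label array like the one above into per-cluster counts in the form of the expected result from the question, np.bincount is one option. This is a minimal sketch: the labels array here is hypothetical (8 zeros and 12 ones, chosen to match the expected result) and stands in for k_means.labels_.

```python
import numpy as np

# Hypothetical labels for 20 items, standing in for k_means.labels_.
labels = np.array([0] * 8 + [1] * 12)

# np.bincount counts how often each non-negative integer label occurs;
# minlength=2 keeps an entry for a cluster even if it is empty.
counts = np.bincount(labels, minlength=2)

# Map cluster index -> number of members.
sizes = {k: int(n) for k, n in enumerate(counts)}
print(sizes)  # {0: 8, 1: 12}
```

Note that k-means numbers its clusters arbitrarily, so on real data the sizes may come out under swapped indices.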