I'm trying to compute the entropy of my k-means result DataFrame, but I get the error TypeError: 'numpy.int32' object is not iterable, and I don't understand why.
from collections import Counter
def calcEntropy(x):
    p, lens = Counter(x), np.float(len(x))
    return -np.sum(count/lens*np.log2(count/lens) for count in p.values())
k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]
Then I get this error message:
<ipython-input-26-d375ecf00330> in <module>()
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]
<ipython-input-26-d375ecf00330> in <listcomp>(.0)
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]
<ipython-input-23-f5508ea8782c> in calcEntropy(x)
1 from collections import Counter
2 def calcEntropy(x):
----> 3 p, lens = Counter(x), np.float(len(x))
4 return -np.sum(count/lens*np.log2(count/lens) for count in p.values())
/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in __init__(*args, **kwds)
535 raise TypeError('expected at most 1 arguments, got %d' % len(args))
536 super(Counter, self).__init__()
--> 537 self.update(*args, **kwds)
538
539 def __missing__(self, key):
/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in update(*args, **kwds)
622 super(Counter, self).update(iterable) # fast path when counter is empty
623 else:
--> 624 _count_elements(self, iterable)
625 if kwds:
626 self.update(kwds)
TypeError: 'numpy.int32' object is not iterable
k_means_sp.head()
credit debit cluster
0 9.207673 8.198884 1
1 4.248495 8.202181 0
2 8.149668 7.735145 2
3 5.138677 7.859741 0
4 8.058163 7.918614 2
Answer 0 (score: 0)
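OK, here's a first attempt. Your DataFrame looks like it stores the cluster index in the 'cluster' column. The list comprehension in your code passes one integer label at a time to calcEntropy, and Counter cannot iterate over a single integer, hence the error. So what you need to do is select each cluster by its index and then pass that cluster to the calcEntropy function, for example: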
for i in range(len(k_means_sp['cluster'].unique())):  # loop through cluster indices
    cluster = k_means_sp.loc[k_means_sp['cluster'] == i, ['credit', 'debit']]
    entropy = calcEntropy(cluster)
The second line filters the rows down to those that share the same cluster index. Does that help?
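To make the cause of the error concrete, here is a minimal sketch that reproduces it and then computes an entropy that does run. It rests on some assumptions: the small DataFrame is a hypothetical stand-in built from the five rows shown in the question, calc_entropy is an illustrative rewrite of calcEntropy (not the asker's exact function), and "entropy" is taken to mean the Shannon entropy of the cluster-label distribution, which may not be what you ultimately want.

```python
from collections import Counter

import numpy as np
import pandas as pd


def calc_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log2(probs)))


# Hypothetical stand-in for k_means_sp, using the five rows from the question.
k_means_sp = pd.DataFrame({
    "credit": [9.207673, 4.248495, 8.149668, 5.138677, 8.058163],
    "debit": [8.198884, 8.202181, 7.735145, 7.859741, 7.918614],
    "cluster": np.array([1, 0, 2, 0, 2], dtype=np.int32),
})

# Iterating over the column yields one scalar label at a time, and Counter
# cannot iterate over a scalar; this is what raises the TypeError.
try:
    Counter(k_means_sp["cluster"].iloc[0])
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# Passing the whole column works, because Counter receives an iterable of labels.
label_entropy = calc_entropy(k_means_sp["cluster"])
```

Keep in mind that Counter iterates over a DataFrame by column label rather than by row, so if you go the per-cluster route you will probably want to pass a single column (or some discretized version of it) rather than the two-column slice.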