Python TypeError: 'numpy.int32' object is not iterable

Asked: 2017-08-23 18:57:07

Tags: python pandas typeerror ipython-notebook entropy

I am trying to compute the entropy of my k-means results DataFrame, but I get the error TypeError: 'numpy.int32' object is not iterable, and I don't understand why.

from collections import Counter 
def calcEntropy(x):
    p, lens = Counter(x), np.float(len(x))
    return -np.sum(count/lens*np.log2(count/lens) for count in p.values())
k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

Then I get this error message:

<ipython-input-26-d375ecf00330> in <module>()
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

<ipython-input-26-d375ecf00330> in <listcomp>(.0)
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

<ipython-input-23-f5508ea8782c> in calcEntropy(x)
      1 from collections import Counter
      2 def calcEntropy(x):
----> 3     p, lens = Counter(x), np.float(len(x))
      4     return -np.sum(count/lens*np.log2(count/lens) for count in p.values())

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in __init__(*args, **kwds)
    535             raise TypeError('expected at most 1 arguments, got %d' % len(args))
    536         super(Counter, self).__init__()
--> 537         self.update(*args, **kwds)
    538 
    539     def __missing__(self, key):

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in update(*args, **kwds)
    622                     super(Counter, self).update(iterable) # fast path when counter is empty
    623             else:
--> 624                 _count_elements(self, iterable)
    625         if kwds:
    626             self.update(kwds)

TypeError: 'numpy.int32' object is not iterable

k_means_sp.head()

      credit    debit   cluster
0   9.207673    8.198884    1
1   4.248495    8.202181    0
2   8.149668    7.735145    2
3   5.138677    7.859741    0
4   8.058163    7.918614    2

1 Answer:

Answer 0: (score: 0)

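OK, here is a first attempt. Your DataFrame looks like it stores the cluster index in the 'cluster' column. So what you need to do is select each cluster by its index and then pass that cluster to the calcEntropy function, for example: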

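for i in range(len(k_means_sp['cluster'].unique())):  # loop through the cluster indices (Python 3: range, not xrange)
    # keep only the rows assigned to cluster i; .loc replaces the deprecated .ix
    cluster = k_means_sp.loc[k_means_sp['cluster'] == i, ['credit', 'debit']]
    entropy = calcEntropy(cluster)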

The second line filters the rows down to those that share the same cluster index. Does this help?
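For what it's worth, the traceback points at a simpler cause: the list comprehension iterates over k_means_sp['cluster'] and hands calcEntropy one label at a time, so Counter(x) receives a scalar numpy.int32 rather than an iterable. Below is a minimal sketch, assuming the goal is the Shannon entropy of the cluster-label distribution over the whole column. The name calc_entropy and the five-row frame are illustrative stand-ins built from the head() output in the question, not the asker's actual data.

```python
from collections import Counter

import numpy as np
import pandas as pd


def calc_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log2(probs)))


# Hypothetical stand-in for k_means_sp, using the five rows shown by head().
k_means_sp = pd.DataFrame({
    "credit": [9.207673, 4.248495, 8.149668, 5.138677, 8.058163],
    "debit": [8.198884, 8.202181, 7.735145, 7.859741, 7.918614],
    "cluster": np.array([1, 0, 2, 0, 2], dtype=np.int32),
})

# Pass the whole column (an iterable of labels), not one label at a time.
label_entropy = calc_entropy(k_means_sp["cluster"])

# Reproduce the original failure: a single label is a scalar, not an iterable.
try:
    Counter(k_means_sp["cluster"].iloc[0])
    scalar_failed = False
except TypeError:
    scalar_failed = True
```

This version also avoids np.float, which newer NumPy releases no longer provide, and builds an array of counts instead of passing a generator to np.sum. If a per-cluster value is what you want, the same function could be applied to whichever column you choose within each cluster, but which column that should be is not clear from the question.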