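Pandas unhashable type: 'numpy.ndarray' with pandas groupby

Time: 2017-08-18 10:08:20

Tags: python-3.x pandas dataframe pandas-groupby

I have what is probably a basic question about pandas DataFrames. In the following snippet, I insert a calculated column 'CAPACITY_CHECK' and then try to group the data by it. But I keep getting the following error: TypeError: unhashable type: 'numpy.ndarray'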



TEMP['CAPACITY_CHECK'] = TEMP[['ADD_CAPACITY_ST', 'CAPACITY_ST', 'VOLUME_PER_SUPPLIER']].apply(lambda X: numpy.where(X[0]+X[1]<X[2],'Non OK', 'OK'), axis=1)
TEMP.groupby('CAPACITY_CHECK')['ID'].count()

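Since I am not trying to modify any immutable object, and the type of the new column is "Series", I don't understand why I get this error.

Thanks in advance

1 Answer:

Answer 0: (score: 2)

I think you need to remove the apply and use only numpy.where: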

mask = (TEMP['ADD_CAPACITY_ST'] + TEMP['CAPACITY_ST']) < TEMP['VOLUME_PER_SUPPLIER']
TEMP['CAPACITY_CHECK'] = numpy.where(mask,'Non OK', 'OK')
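As for why the original version fails, a likely explanation is that numpy.where, when called inside apply on the scalar values of a single row, returns a zero-dimensional numpy.ndarray rather than a plain string. Each cell of the new column would then hold an ndarray, and groupby cannot hash those. The sketch below only illustrates that behaviour on one hard-coded row (the numbers 10, 10 and 40 are made up for illustration); it does not reproduce the original DataFrame.

```python
import numpy

# Same comparison as in the lambda, but for one row of scalar values.
cell = numpy.where(10 + 10 < 40, 'Non OK', 'OK')

# The result is a 0-dimensional array, not a Python string.
is_ndarray = isinstance(cell, numpy.ndarray)
ndim = cell.ndim

# ndarrays are not hashable, which is what groupby trips over.
try:
    hash(cell)
    hash_error = None
except TypeError as exc:
    hash_error = str(exc)

# .item() unwraps the 0-d array into a hashable Python string.
as_text = cell.item()

print(is_ndarray, ndim)
print(hash_error)
print(as_text)
```

Calling numpy.where once on the whole boolean mask, as above, avoids the problem because it returns a single one-dimensional array of strings that pandas stores as ordinary column values.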

Sample:

import numpy
import pandas as pd

TEMP = pd.DataFrame({'ADD_CAPACITY_ST':[10,20,30],
                     'CAPACITY_ST':[10,20,30],
                     'VOLUME_PER_SUPPLIER':[40,20,100]})

# Vectorized comparison over whole columns, no apply needed
mask = (TEMP['ADD_CAPACITY_ST'] + TEMP['CAPACITY_ST']) < TEMP['VOLUME_PER_SUPPLIER']
TEMP['CAPACITY_CHECK'] = numpy.where(mask,'Non OK', 'OK')
print(TEMP)
   ADD_CAPACITY_ST  CAPACITY_ST  VOLUME_PER_SUPPLIER CAPACITY_CHECK
0               10           10                   40         Non OK
1               20           20                   20             OK
2               30           30                  100         Non OK              

Then use GroupBy.size or GroupBy.count:

TEMP.groupby('CAPACITY_CHECK')['ID'].size()

TEMP.groupby('CAPACITY_CHECK')['ID'].count()

Difference between count and size
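In short, size counts every row in a group, while count skips rows where the selected column is missing. The sample above has no 'ID' column, so the sketch below adds a hypothetical one with a missing value purely to make the difference visible; the ID values are assumptions, not data from the question.

```python
import numpy
import pandas as pd

# 'ID' is a made-up column for illustration; the second row has no ID.
TEMP = pd.DataFrame({'ID': [1, None, 3],
                     'ADD_CAPACITY_ST': [10, 20, 30],
                     'CAPACITY_ST': [10, 20, 30],
                     'VOLUME_PER_SUPPLIER': [40, 20, 100]})

mask = (TEMP['ADD_CAPACITY_ST'] + TEMP['CAPACITY_ST']) < TEMP['VOLUME_PER_SUPPLIER']
TEMP['CAPACITY_CHECK'] = numpy.where(mask, 'Non OK', 'OK')

# size: number of rows per group, missing values included
sizes = TEMP.groupby('CAPACITY_CHECK')['ID'].size()

# count: number of non-missing 'ID' values per group
counts = TEMP.groupby('CAPACITY_CHECK')['ID'].count()

print(sizes)
print(counts)
```

Here the 'OK' group contains one row whose ID is missing, so size reports 1 for it and count reports 0, while both report 2 for 'Non OK'.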