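"TypeError: Cannot convert bool to numpy.ndarray" when grouping by multiple columns

Date: 2019-10-03 16:34:51

Tags: python pandas dataframe pandas-groupby

I want to group a DataFrame by two columns to summarize the average monthly sales for each store.

Data (the pandas DataFrame fact):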

   store_id  sku_id        date  quantity    city    city category  month
0       354   31253  2017-08-08         1   Paris   Paris    Shirt      8
1       354   31253  2017-08-19         1   Paris   Paris    Shirt      8
2       354   31258  2017-07-30         1   Paris   Paris    Shirt      7
3       354  277171  2017-09-28         1   Paris   Paris    Shirt      9
4       174  295953  2017-08-16         1  London  London    Shirt      8

Grouping by store_id alone or by month alone works fine, but when I try to group by both store_id and month at the same time, I get:

groupby_month = fact['quantity'].groupby(fact['store_id', 'month'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-169-a8cffb72ab7c> in <module>
----> 1 groupby_month = fact['quantity'].groupby(fact['store_id', 'month'])
      2 
      3 

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

D:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2655                                  'backfill or nearest lookups')
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:
   2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine._get_loc_duplicates()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine._maybe_get_bool_indexer()

TypeError: Cannot convert bool to numpy.ndarray

2 answers:

Answer 0 (score: 3)

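First, check the index labels and the columns:

fact.index
fact.columns

If you need to convert the index to columns, use:

fact.reset_index()

Then you can group by both columns by passing their names as a list to groupby.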
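A minimal sketch of these steps, assuming the sample rows from the question. Dropping the second city column and averaging the monthly totals per store are assumptions of this sketch, not something the answer states; the names duplicate_labels, monthly_total and avg_monthly_sales are illustrative only.

```python
import pandas as pd

# Sample rows from the question, including the duplicated "city" column.
columns = ["store_id", "sku_id", "date", "quantity",
           "city", "city", "category", "month"]
rows = [
    [354, 31253, "2017-08-08", 1, "Paris", "Paris", "Shirt", 8],
    [354, 31253, "2017-08-19", 1, "Paris", "Paris", "Shirt", 8],
    [354, 31258, "2017-07-30", 1, "Paris", "Paris", "Shirt", 7],
    [354, 277171, "2017-09-28", 1, "Paris", "Paris", "Shirt", 9],
    [174, 295953, "2017-08-16", 1, "London", "London", "Shirt", 8],
]
fact = pd.DataFrame(rows, columns=columns)

# Step 1: inspect the labels; duplicated() flags repeated column names.
duplicate_labels = fact.columns[fact.columns.duplicated()].tolist()

# Optional clean-up (an assumption of this sketch): keep the first copy
# of each column so every label is unique.
fact = fact.loc[:, ~fact.columns.duplicated()]

# Step 2: fact['store_id', 'month'] asks for ONE column labelled by the
# tuple ('store_id', 'month'). Passing a list of names groups by both.
monthly_total = fact.groupby(["store_id", "month"])["quantity"].sum()

# Step 3: one reading of "average monthly sales per store" is to
# average the monthly totals within each store.
avg_monthly_sales = monthly_total.groupby(level="store_id").mean()

print(duplicate_labels)
print(monthly_total)
print(avg_monthly_sales)
```

If only the per-group mean of quantity is wanted, replacing .sum() with .mean() in step 2 would give it directly.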

Answer 1 (score: 0)

You need to add as_index=True.

For example: count_in = df.groupby(['time_in','id'], as_index=True)['time_in'].count()
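For comparison, a small sketch of what as_index controls, using a hypothetical cut-down version of the question's data. In pandas, as_index=True is already the default for groupby, so passing it explicitly should give the same result as omitting it; as_index=False returns the group keys as ordinary columns instead.

```python
import pandas as pd

# Hypothetical miniature of the question's data (three columns only).
fact = pd.DataFrame({
    "store_id": [354, 354, 354, 354, 174],
    "month": [8, 8, 7, 9, 8],
    "quantity": [1, 1, 1, 1, 1],
})

keys = ["store_id", "month"]

# Default behaviour: the group keys become a MultiIndex on the result.
by_default = fact.groupby(keys)["quantity"].sum()

# Spelling out the default explicitly yields the same Series.
explicit_true = fact.groupby(keys, as_index=True)["quantity"].sum()

# as_index=False keeps the keys as regular columns of a DataFrame.
flat = fact.groupby(keys, as_index=False)["quantity"].sum()

print(by_default)
print(flat)
```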