Question

For data like this:

    author        cat  val
0  author1  category2   15
1  author2  category4    9
2  author3  category1    7
3  author4  category1    9
4  author5  category2   11

I want:

      cat mean count
category2   13     2
category1    8     2
category4    9     1

I thought I was good at pandas and wrote:

most_expensive_standalone.groupby('cat').apply(['mean', 'count']).sort(['count', 'mean'])

but got:

  File "/home/mike/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 3862, in _intercept_function
    return _func_table.get(func, fnc)
TypeError: unhashable type: 'list'

Answer 1

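If you just want to pass the two aggregation functions mean and count to your data, you should use .agg instead of .apply. Also, because you are applying two functions to the same column val, the result has a multi-level column index. So before sorting on the newly created columns mean and count, you need to select the outer level val first.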

most_expensive_standalone.groupby('cat').agg(['mean', 'count'])['val'].sort(['mean', 'count'])


           mean  count
cat                   
category1     8      2
category4     9      1
category2    13      2
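The one-liner above relies on DataFrame.sort, which was removed from later pandas releases in favour of sort_values. A minimal sketch of the same steps for a recent pandas follows. The name df stands in for most_expensive_standalone and is rebuilt from the sample data in the question. Selecting val before aggregating is a choice made here, not part of the original answer: it keeps the string column author out of the mean and gives flat mean and count columns directly. Note that mean comes back as a float, so it prints as 8.0 rather than 8.

```python
import pandas as pd

# Sample data from the question; `df` stands in for most_expensive_standalone.
df = pd.DataFrame({
    "author": ["author1", "author2", "author3", "author4", "author5"],
    "cat": ["category2", "category4", "category1", "category1", "category2"],
    "val": [15, 9, 7, 9, 11],
})

# Aggregate only the numeric column; the result has flat columns 'mean' and 'count'.
summary = df.groupby("cat")["val"].agg(["mean", "count"])

# sort_values replaces the older DataFrame.sort; ascending order by mean, then count.
result = summary.sort_values(["mean", "count"])
print(result)
```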

Step by step:

# just perform groupby and .agg will give you this
most_expensive_standalone.groupby('cat').agg(['mean', 'count'])

           val      
          mean count
cat                 
category1    8     2
category2   13     2
category4    9     1

Select the val column:

most_expensive_standalone.groupby('cat').agg(['mean', 'count'])['val']


           mean  count
cat                   
category1     8      2
category2    13      2
category4     9      1
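To see the multi-level column index for yourself, the sketch below inspects the column labels before and after selecting the outer level. It assumes a recent pandas and rebuilds the sample data as df (a stand-in for most_expensive_standalone). It selects [['val']] before aggregating so that the string column author is not passed to mean.

```python
import pandas as pd

# Sample data from the question; `df` stands in for most_expensive_standalone.
df = pd.DataFrame({
    "author": ["author1", "author2", "author3", "author4", "author5"],
    "cat": ["category2", "category4", "category1", "category1", "category2"],
    "val": [15, 9, 7, 9, 11],
})

# Aggregating a DataFrame selection with a list of functions yields
# two-level column labels: (source column, function name).
grouped = df.groupby("cat")[["val"]].agg(["mean", "count"])
print(grouped.columns.tolist())

# Indexing with the outer level 'val' drops that level,
# leaving plain 'mean' and 'count' columns that can be sorted on.
flat = grouped["val"]
print(flat.columns.tolist())
```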

Finally, call .sort(['mean', 'count']).
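The table the question asks for is ordered by count first and then mean, both descending, which differs from the ascending sort on ['mean', 'count'] shown in the answer. A sketch that would reproduce that order, assuming a recent pandas (sort_values in place of the older sort) and with df as a stand-in for most_expensive_standalone:

```python
import pandas as pd

# Sample data from the question; `df` stands in for most_expensive_standalone.
df = pd.DataFrame({
    "author": ["author1", "author2", "author3", "author4", "author5"],
    "cat": ["category2", "category4", "category1", "category1", "category2"],
    "val": [15, 9, 7, 9, 11],
})

summary = df.groupby("cat")["val"].agg(["mean", "count"])

# Sort by count, then mean, largest first, to match the layout in the question.
ordered = summary.sort_values(["count", "mean"], ascending=False)

# reset_index turns the 'cat' index back into an ordinary column.
print(ordered.reset_index())
```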

Using Pandas in Python, how do I sort by two columns created by the agg function?
