Using Pandas in Python, how do I sort by two columns created by the agg function?

Time: 2015-07-12 14:37:18

Tags: python pandas

For data like this:

    author        cat  val
0  author1  category2   15
1  author2  category4    9
2  author3  category1    7
3  author4  category1    9
4  author5  category2   11

I want:

      cat mean count
category2   13     2
category1    8     2
category4    9     1

I thought I was good at pandas, so I wrote:

most_expensive_standalone.groupby('cat').apply(['mean', 'count']).sort(['count', 'mean'])

but got:

  File "/home/mike/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 3862, in _intercept_function
    return _func_table.get(func, fnc)
TypeError: unhashable type: 'list'

1 answer:

Answer 0 (score: 2)

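If you just want to pass the two aggregation functions mean and count to your data, use .agg instead of .apply. Also, because you are applying two functions to the same column val, the result has a multi-level column index. So before sorting by the newly created columns mean and count, you first need to select the outer level val: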

most_expensive_standalone.groupby('cat').agg(['mean', 'count'])['val'].sort(['mean', 'count'])


           mean  count
cat                   
category1     8      2
category4     9      1
category2    13      2

Step by step:

# just perform groupby and .agg will give you this
most_expensive_standalone.groupby('cat').agg(['mean', 'count'])

           val      
          mean count
cat                 
category1    8     2
category2   13     2
category4    9     1
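
The two-row header in the output above is a column MultiIndex: each column is addressed by an (outer, inner) pair. The following is a minimal sketch for inspecting it, rebuilding the sample data from the question. The `[["cat", "val"]]` selection is an assumption added here so that newer pandas versions do not try to compute a mean over the string column author, and `column_tuples` is a name introduced only for illustration.

```python
import pandas as pd

# Rebuild the question's sample data (variable name follows the question).
most_expensive_standalone = pd.DataFrame({
    "author": ["author1", "author2", "author3", "author4", "author5"],
    "cat": ["category2", "category4", "category1", "category1", "category2"],
    "val": [15, 9, 7, 9, 11],
})

# Assumption: keep only the grouping key and the numeric column, so that
# 'mean' is never attempted on the string column 'author'.
grouped = most_expensive_standalone[["cat", "val"]].groupby("cat").agg(["mean", "count"])

# Each column label is an (outer, inner) tuple: outer = source column,
# inner = aggregation name.
column_tuples = list(grouped.columns)
print(column_tuples)

# Selecting the outer level 'val' leaves a frame with flat columns.
flat = grouped["val"]
print(list(flat.columns))
```

Because the sort keys mean and count live on the inner level, selecting `['val']` first is what makes them usable as plain column names.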

Select val:

most_expensive_standalone.groupby('cat').agg(['mean', 'count'])['val']


           mean  count
cat                   
category1     8      2
category2    13      2
category4     9      1

Finally, call .sort(['mean', 'count']).
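
The answer was written against a 2015-era pandas. On recent releases `DataFrame.sort` is no longer available and `DataFrame.sort_values` takes its place, so the same pipeline might look like the sketch below. It assumes a recent pandas version; the names `df`, `stats`, `ascending_result`, and `desired_result` are introduced here for illustration only. Selecting `["val"]` before `.agg` is a variation on the answer's approach that avoids the multi-level column index altogether.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "author": ["author1", "author2", "author3", "author4", "author5"],
    "cat": ["category2", "category4", "category1", "category1", "category2"],
    "val": [15, 9, 7, 9, 11],
})

# Aggregate only the 'val' column; the result has flat 'mean' and 'count'
# columns, so no outer level needs to be selected afterwards.
stats = df.groupby("cat")["val"].agg(["mean", "count"])

# Same ordering as the answer: ascending by mean, ties broken by count.
ascending_result = stats.sort_values(["mean", "count"])
print(ascending_result)

# Ordering shown in the question's desired output: count descending,
# then mean descending.
desired_result = stats.sort_values(["count", "mean"], ascending=False)
print(desired_result)
```

Note that the question's desired table is ordered by count and then mean, both descending, which is what the second call reproduces; the first call mirrors the ascending order in the answer's output.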