Aggregating with multiple functions using axis=1

Time: 2017-08-02 21:02:28

Tags: python pandas

What is the best way to aggregate a DataFrame across columns (axis=1) using multiple functions?

Applying a list of functions works as expected with the default axis=0:

In [7]: tsdf = pd.DataFrame(np.random.randn(2, 3), columns=['A', 'B', 'C'],
   ...:                     index=pd.date_range('1/1/2000', periods=2))
   ...: tsdf

Out[7]:
                   A         B         C
2000-01-01 -0.496619  0.282351  0.222707
2000-01-02  1.185002 -0.988669 -2.300515

In [8]: tsdf.agg(['min', 'max', 'mean'])
Out[8]:
             A         B         C
min  -0.496619 -0.988669 -2.300515
max   1.185002  0.282351  0.222707
mean  0.344191 -0.353159 -1.038904

But it fails with axis=1:

In [9]: tsdf.agg(['min', 'max', 'mean'], axis=1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-ad4197b17943> in <module>()
----> 1 tsdf.agg(['min', 'max', 'mean'], axis=1)

c:\python34\lib\site-packages\pandas\core\frame.py in aggregate(self, func, axis, *args, **kwargs)
   4152                 pass
   4153         if result is None:
-> 4154             return self.apply(func, axis=axis, args=args, **kwargs)
   4155         return result
   4156

c:\python34\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4260                         f, axis,
   4261                         reduce=reduce,
-> 4262                         ignore_failures=ignore_failures)
   4263             else:
   4264                 return self._apply_broadcast(f, axis)

c:\python34\lib\site-packages\pandas\core\frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4356             try:
   4357                 for i, v in enumerate(series_gen):
-> 4358                     results[i] = func(v)
   4359                     keys.append(v.name)
   4360             except Exception as e:

TypeError: ("'list' object is not callable", 'occurred at index 2000-01-01 00:00:00')

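Am I missing something? I (naively) assumed the two axes would behave more or less symmetrically.

What is the best way to apply multiple aggregation functions using axis=1?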

Thanks, Alex

1 answer:

Answer 0: (score: 2)

I think this is a bug that is listed on the pandas-dev GitHub.

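However, there is a workaround: transpose the frame, aggregate along the default axis, and transpose the result back: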

tsdf.T.agg(['min','max','mean']).T

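Output (the values do not match the question's frame, presumably because it was regenerated with np.random.randn):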

                 min       max      mean
2000-01-01  0.187605  1.707985  0.874033
2000-01-02 -1.156725  1.121996 -0.009986
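
As a reproducible sketch of the workaround, the snippet below rebuilds the frame from the fixed values shown in the question's Out[7] instead of drawing new random numbers, then applies the transpose trick. For comparison it also builds the same table by calling each reduction with axis=1 and assembling the columns by hand. The names funcs, via_transpose and via_reductions are illustrative and not part of the original answer, and the second approach is an alternative I am assuming is acceptable here, not something the answer proposed.

```python
import numpy as np
import pandas as pd

# Fixed values copied from the question's Out[7], so the result is reproducible.
tsdf = pd.DataFrame(
    [[-0.496619, 0.282351, 0.222707],
     [1.185002, -0.988669, -2.300515]],
    columns=['A', 'B', 'C'],
    index=pd.date_range('1/1/2000', periods=2),
)

funcs = ['min', 'max', 'mean']  # illustrative name

# Workaround from the answer: transpose, aggregate down the columns,
# then transpose back so that each original row gets one value per function.
via_transpose = tsdf.T.agg(funcs).T

# Alternative (an assumption of this sketch): call each reduction with axis=1
# and assemble the resulting Series into a frame, one column per function.
via_reductions = pd.DataFrame(
    {name: getattr(tsdf, name)(axis=1) for name in funcs}
)

print(via_transpose)

# Both approaches should agree up to floating-point tolerance.
print(np.allclose(via_transpose.to_numpy(), via_reductions.to_numpy()))
```

The transpose route keeps the answer's one-liner, while the explicit route avoids transposing, which may matter for frames with mixed column dtypes because transposing forces a common dtype.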