使用pandas.pivot_table在不提供分组的情况下计算整个表的聚合函数的最佳方法是什么?
例如,如果我想将A,B,C的总和计算到一个具有单行的表中,而不用任何columsn进行分组:
>>> x = pd.DataFrame({'A':[1,2,3],'B':[8,7,6],'C':[0,3,2]})
>>> x
A B C
0 1 8 0
1 2 7 3
2 3 6 2
>>> x.pivot_table(values=['A','B','C'],aggfunc=np.sum)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/tool/pandora64/.package/python-2.7.5/lib/python2.7/site-packages/pandas/tools/pivot.py", line 103, in pivot_table
grouped = data.groupby(keys)
File "/tool/pandora64/.package/python-2.7.5/lib/python2.7/site-packages/pandas/core/generic.py", line 2434, in groupby
sort=sort, group_keys=group_keys, squeeze=squeeze)
File "/tool/pandora64/.package/python-2.7.5/lib/python2.7/site-packages/pandas/core/groupby.py", line 789, in groupby
return klass(obj, by, **kwds)
File "/tool/pandora64/.package/python-2.7.5/lib/python2.7/site-packages/pandas/core/groupby.py", line 238, in __init__
level=level, sort=sort)
File "/tool/pandora64/.package/python-2.7.5/lib/python2.7/site-packages/pandas/core/groupby.py", line 1622, in _get_grouper
raise ValueError('No group keys passed!')
ValueError: No group keys passed!
另外,我想使用自定义aggfunc,上面的np.sum只是一个例子。
感谢。
答案 0 :(得分:0)
我想您正在询问如何将函数应用于数据框的所有列:要执行此操作,请调用数据框的apply
方法:
def myfunc(col):
return np.sum(col)
x.apply(myfunc)
Out[1]:
A 6
B 21
C 5
dtype: int64