Pandas pivot table with nested aggfunc

Date: 2019-10-08 15:47:52

Tags: python pandas pivot-table

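I am trying to create a pivot table that counts the number of forms and also computes the sum, mean, and median of that count. However, the Forms dtype is categorical, and I cannot use the mean and median functions on non-numeric values.

I want to use 'Form Type': count as the value to aggregate.

If I include mean in the first aggfunc, I get this error: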

DataError: No numeric types to aggregate

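Is it possible to nest aggfuncs in a pivot table, or to change the dtype while creating the pivot table?

I tried using .astype(int), but I can't seem to figure out the syntax for that function.

The dummy code below is not entirely accurate, but it is something I can keep working from.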

import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
                   "C": ["105319", "1271075", "84565", "84354", "54835", "81638", "1282224", "41856", "78987"],
                   "Form Type": ["144", "D", "D/A", "144", "D", "D", "D", "S-1", "D"]})
table = pd.pivot_table(df, columns=['Form Type'],
                       index=['A', 'B', 'C'],
                       fill_value=' ',
                       aggfunc={'Form Type': ['count']})

The output looks similar to this:

[image: pivot table output]

1 Answer:

Answer 0: (score: 0)

Use .value_counts() to count Form Type

  • This assumes you want the count of each Form Type, and then the mean, median, and sum of those counts:

import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
                   "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
                   "C": ["105319", "1271075", "84565", "84354", "54835", "81638", "1282224", "41856", "78987"],
                   "Form Type": ["144", "D", "D/A", "144", "D", "D", "D", "S-1", "D"]})

counts = df['Form Type'].value_counts().rename_axis('Form Type').reset_index(name='counts')

  Form Type  counts
          D       5
        144       2
        S-1       1
        D/A       1

counts['counts'].agg(['mean', 'median', 'sum'])

# Output:
mean      2.25
median    1.50
sum       9.00
Name: counts, dtype: float64

Alternative:

  • If you want .value_counts per group rather than over the whole column, combine it with groupby.
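The groupby variant could look roughly like the sketch below. The grouping keys (A and B) and the names `counts` and `summary` are assumptions chosen for illustration; the original answer showed its version only as an image.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "Form Type": ["144", "D", "D/A", "144", "D", "D", "D", "S-1", "D"],
})

# Count each Form Type within every (A, B) group (grouping keys are an assumption).
# Renaming the Series avoids a column-name clash when the index is reset.
counts = (
    df.groupby(["A", "B"])["Form Type"]
      .value_counts()
      .rename("counts")
      .reset_index()
)

# The counts are numeric, so sum, mean, and median can be applied per group.
summary = counts.groupby(["A", "B"])["counts"].agg(["sum", "mean", "median"])
print(summary)
```

Because the aggregation runs on the numeric `counts` column rather than on Form Type itself, the "No numeric types to aggregate" error should not come up.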

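To get back to the pivot table from the question, one option is to do the counting first and then pivot the numeric result. This is only a sketch under assumptions: the index columns (A and B) and the variable names are illustrative, and it is a two-step approach rather than a true nested aggfunc inside a single pivot_table call.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "Form Type": ["144", "D", "D/A", "144", "D", "D", "D", "S-1", "D"],
})

# Step 1: turn the non-numeric Form Type column into numeric counts.
counts = (
    df.groupby(["A", "B", "Form Type"])
      .size()
      .reset_index(name="counts")
)

# Step 2: pivot the numeric counts; mean and median now have numbers to work on.
table = pd.pivot_table(
    counts,
    index=["A", "B"],
    values="counts",
    aggfunc=["sum", "mean", "median"],
)
print(table)
```

With a list passed as aggfunc, the resulting columns form a MultiIndex such as ("sum", "counts"), so a single cell can be read with `table.loc[("foo", "one"), ("sum", "counts")]`.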