I am trying to produce subtotals and a grand total for my data, but I am stuck and cannot get the result I need. Can you help?
data.groupby(['Column4', 'Column5'])['Column1'].count()
Current output:
Column4 Column5
2018-05-19 Duplicate 220
Informative 3
2018-05-20 Actionable 5
Duplicate 270
Informative 859
Non-actionable 2
2018-05-21 Actionable 8
Duplicate 295
Informative 17
2018-05-22 Actionable 10
Duplicate 424
Informative 36
2018-05-23 Actionable 8
Duplicate 157
Informative 3
2018-05-24 Actionable 5
Duplicate 78
Informative 3
2018-05-25 Actionable 3
Duplicate 80
Expected output:
Row Labels   Actionable  Duplicate  Informative  Non-actionable  Grand Total
5/19/2018                219        3                            222
5/20/2018    5           270        859          2               1136
5/21/2018    8           295        17                           320
5/22/2018    10          424        36                           470
5/23/2018    8           157        3                            168
5/24/2018    5           78         3                            86
5/25/2018    3           80                                      83
Grand Total  39          1523       921          2               2485
Here is some sample data. Please take a look. I am getting a few errors, possibly because I did not provide the right data earlier. Please check once more.
Column1     Column2   Column3  Column4    Column5         Column6
BI Account  Subject1  2:12 PM  5/19/2018  Duplicate       Name1
PI Account  Subject2  1:58 PM  5/19/2018  Actionable      Name2
AI Account  Subject3  5:01 PM  5/19/2018  Non-Actionable  Name3
BI Account  Subject4  5:57 PM  5/19/2018  Informative     Name4
PI Account  Subject5  6:59 PM  5/19/2018  Duplicate       Name5
AI Account  Subject6  8:07 PM  5/19/2018  Actionable      Name1
Answer 0 (score: 1)
You can use pivot to get the desired layout from your current output, then sum to compute the totals you need.
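import pandas as pd

# df starts as the grouped counts from the question (a Series with a two-level index)
df = data.groupby(['Column4', 'Column5'])['Column1'].count()
# Reshape to one row per date (Column4) and one column per category (Column5)
df = df.reset_index().pivot(index='Column4', columns='Column5', values='Column1')
# Add the grand total column, summing across all other columns
df['Grand Total'] = df.sum(axis=1)
df.columns.name = None
df.index.name = None
# Add the grand total row, summing all values in each column
df.loc['Grand Total', :] = df.sum()
df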
df is now:
Actionable Duplicate Informative Non-actionable Grand Total
2018-05-19 NaN 220.0 3.0 NaN 223.0
2018-05-20 5.0 270.0 859.0 2.0 1136.0
2018-05-21 8.0 295.0 17.0 NaN 320.0
2018-05-22 10.0 424.0 36.0 NaN 470.0
2018-05-23 8.0 157.0 3.0 NaN 168.0
2018-05-24 5.0 78.0 3.0 NaN 86.0
2018-05-25 3.0 80.0 NaN NaN 83.0
Grand Total 39.0 1524.0 921.0 2.0 2486.0
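For reference, here is a self-contained sketch of the same pivot-then-sum approach. The small data frame is made up for illustration and is not the asker's data. The column roles (Column4 as the date, Column5 as the category, Column1 as the counted column) are assumed from the groupby in the question, and the names counts and wide are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    'Column1': ['BI', 'PI', 'AI', 'BI', 'PI', 'AI'],
    'Column4': ['2018-05-19', '2018-05-19', '2018-05-19',
                '2018-05-20', '2018-05-20', '2018-05-20'],
    'Column5': ['Duplicate', 'Duplicate', 'Informative',
                'Actionable', 'Duplicate', 'Duplicate'],
})

# Same aggregation as in the question: row counts per (date, category).
counts = data.groupby(['Column4', 'Column5'])['Column1'].count()

# Long to wide: one row per date, one column per category.
wide = counts.reset_index().pivot(index='Column4', columns='Column5', values='Column1')

# Totals: first across each row, then down each column.
wide['Grand Total'] = wide.sum(axis=1)
wide.columns.name = None
wide.index.name = None
wide.loc['Grand Total', :] = wide.sum()

print(wide)
```

Combinations that never occur (such as Actionable on 2018-05-19 here) stay NaN, which is why the counts print as floats.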
Answer 1 (score: 1)
Just use crosstab:
pd.crosstab(df['Column4'], df['Column5'], margins = True, margins_name = 'Grand Total' )
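A minimal runnable sketch of the crosstab call, assuming it is applied to the raw rows rather than to the grouped counts. The sample frame is invented for illustration, and the names data and table are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    'Column4': ['2018-05-19', '2018-05-19', '2018-05-19',
                '2018-05-20', '2018-05-20', '2018-05-20'],
    'Column5': ['Duplicate', 'Duplicate', 'Informative',
                'Actionable', 'Duplicate', 'Duplicate'],
})

# Count rows per (date, category) and append totals as a last row and column.
table = pd.crosstab(
    data['Column4'],
    data['Column5'],
    margins=True,
    margins_name='Grand Total',
)

print(table)
```

Because crosstab counts rows directly, no prior groupby is needed, and combinations that never occur show up as 0 instead of NaN.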
Answer 2 (score: 0)
Take a look at this: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pivot.html
You need to pivot your table:
df.reset_index().pivot(index='Column4', columns='Column5', values='Column1')