Adding a row of totals to a dataframe

Time: 2018-02-23 16:08:59

Tags: pandas sum append

I have a dataframe, and I am trying to figure out how to add a row for each client that sums that client's hours. Here is a sample of my dataframe:

                             hours
client          month

A               January     203.50
                February    227.75
                March       159.75
                April       203.25
                May         199.90

B               January     203.50
                February    227.75
                March       159.75
                April       203.25
                May         199.90

C               January     203.50
                February    227.75
                March       159.75
                April       203.25
                May         199.90

I would like to add a new row for each client that totals the hours. It would look like this:

                             hours
client          month

A               January     203.50
                February    227.75
                March       159.75
                April       203.25
                May         199.90
                Total       994.15

B               January     203.50
                February    227.75
                March       159.75
                April       203.25
                May         199.90
                Total       994.15

C               January     203.50
                February    227.75
                March       159.75
                April       203.25
                May         199.90
                Total       994.15

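I have tried writing a for loop that iterates over each client, sums the hours, and then appends a new row for each client. The loop I am trying looks like this:

for hours in df:
    df.append(pd.Series(vp.sum(), name='Total'))

However, this doesn't work. Any help would be greatly appreciated!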

2 answers:

Answer 0: (score: 2)

IIUC, you can use concat:

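# DataFrame.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
pd.concat([df,df.groupby(level=0).sum().assign(month='Total').set_index('month',append=True)]).sort_index()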
Out[1754]: 
                  hours
client month           
A      April     203.25
       February  227.75
       January   203.50
       March     159.75
       May       199.90
       Total     994.15
B      April     203.25
       February  227.75
       January   203.50
       March     159.75
       May       199.90
       Total     994.15
C      April     203.25
       February  227.75
       January   203.50
       March     159.75
       May       199.90
       Total     994.15
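
For anyone who wants to run the concat approach end to end, here is a minimal sketch. It rebuilds the sample data from the question by hand (the `index` and `totals` names are illustrative assumptions, not part of the original answer) and uses `groupby(level=...)` for the per-client sum.

```python
import pandas as pd

# Sample data mirroring the question: the same five monthly values per client.
months = ["January", "February", "March", "April", "May"]
hours = [203.50, 227.75, 159.75, 203.25, 199.90]
index = pd.MultiIndex.from_product(
    [["A", "B", "C"], months], names=["client", "month"]
)
df = pd.DataFrame({"hours": hours * 3}, index=index)

# Per-client totals, given a (client, "Total") index so they line up with df.
totals = (
    df.groupby(level="client").sum()
      .assign(month="Total")
      .set_index("month", append=True)
)

# sort_index() places each Total row with its client; the months are
# sorted alphabetically as a side effect.
result = pd.concat([df, totals]).sort_index()
print(result)
```

Note that `sort_index()` only puts `Total` last here because "Total" happens to sort after every month name in this data.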

Answer 1: (score: 1)

You can unstack and then add a column.

df.unstack().hours.assign(Total=lambda d: d.sum(1)).stack().to_frame('hours')

                  hours
client month           
A      April     203.25
       February  227.75
       January   203.50
       March     159.75
       May       199.90
       Total     994.15
B      April     203.25
       February  227.75
       January   203.50
       March     159.75
       May       199.90
       Total     994.15
C      April     203.25
       February  227.75
       January   203.50
       March     159.75
       May       199.90
       Total     994.15
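
The unstack idea can also be written out step by step, which may be easier to follow than the one-liner. This is only a sketch on hand-built sample data; `wide` and `result` are illustrative names.

```python
import pandas as pd

# Sample data mirroring the question.
months = ["January", "February", "March", "April", "May"]
hours = [203.50, 227.75, 159.75, 203.25, 199.90]
index = pd.MultiIndex.from_product(
    [["A", "B", "C"], months], names=["client", "month"]
)
df = pd.DataFrame({"hours": hours * 3}, index=index)

# Pivot the month level into columns: one row per client, one column per month.
wide = df["hours"].unstack()

# Add a Total column holding the sum across each client's months.
wide["Total"] = wide.sum(axis=1)

# Pivot the columns back into the index and restore the single "hours" column.
result = wide.stack().to_frame("hours")
print(result)
```

Because the total is computed while the data is in wide form, it becomes just another "month" label when the frame is stacked back.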

A variant that preserves the original row order:

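# get the original order of the months
months = pd.unique(df.index.get_level_values(1)).tolist()

# create a totals dataframe with two index levels
# (Series.sum(level=0) was removed in pandas 2.0, so group by the level instead)
totals = pd.DataFrame.from_dict({
    (client, 'Total'): {'hours': v}
    for client, v in df.hours.groupby(level=0).sum().items()
}, 'index')
totals.index.names = df.index.names

# concatenate (DataFrame.append was removed in pandas 2.0) and reindex
pd.concat([df, totals]).sort_index().reindex(months + ['Total'], level=1)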

                  hours
client month           
A      January   203.50
       February  227.75
       March     159.75
       April     203.25
       May       199.90
       Total     994.15
B      January   203.50
       February  227.75
       March     159.75
       April     203.25
       May       199.90
       Total     994.15
C      January   203.50
       February  227.75
       March     159.75
       April     203.25
       May       199.90
       Total     994.15
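
Another way to keep the original month order, sketched here as an alternative rather than as the answer's own method, is to build the full target index explicitly and reindex against it. It assumes every client has a row for every month (otherwise the reindex would introduce NaN rows); `clients`, `month_order`, and `target` are illustrative names.

```python
import pandas as pd

# Sample data mirroring the question.
months = ["January", "February", "March", "April", "May"]
hours = [203.50, 227.75, 159.75, 203.25, 199.90]
index = pd.MultiIndex.from_product(
    [["A", "B", "C"], months], names=["client", "month"]
)
df = pd.DataFrame({"hours": hours * 3}, index=index)

# Labels in their order of first appearance, with "Total" placed last.
clients = df.index.get_level_values("client").unique()
month_order = df.index.get_level_values("month").unique().tolist() + ["Total"]

# Per-client totals, re-labelled as (client, "Total").
totals = df.groupby(level="client").sum()
totals.index = pd.MultiIndex.from_product(
    [totals.index, ["Total"]], names=df.index.names
)

# Explicit target order: every client paired with every month, then Total.
target = pd.MultiIndex.from_product([clients, month_order], names=df.index.names)
result = pd.concat([df, totals]).reindex(target)
print(result)
```

Spelling out the target index avoids relying on how sorting happens to order the labels.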