Question

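I'm fairly new to Python and Pandas. With the help of Google and StackOverflow, I've been able to figure out most of what I needed. This one, however, has me stumped. I have a dataframe that looks like this: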

                        SalesPerson 1               SalesPerson 2               SalesPerson 3
                        Revenue  Number of Orders   Revenue  Number of Orders   Revenue  Number of Orders
    In Process Stage 1  8347     8                  9941     5                  5105     7
    In Process Stage 2  3879     2                  3712     3                  1350     10
    In Process Stage 3  7885     4                  6513     8                  2218     2
    Won Not Invoiced    4369     1                  1736     5                  4950     9
    Won Invoiced        7169     5                  5308     3                  9832     2
    Lost to Competitor  8780     1                  3836     7                  2851     3
    Lost to No Action   2835     5                  4653     1                  1270     2

I want to add subtotal rows for "In Process", "Won", and "Lost", so that my data looks like this:

                         SalesPerson 1               SalesPerson 2               SalesPerson 3
                         Revenue  Number of Orders   Revenue  Number of Orders   Revenue  Number of Orders
    In Process Stage 1   8347     8                  9941     5                  5105     7
    In Process Stage 2   3879     2                  3712     3                  1350     10
    In Process Stage 3   7885     4                  6513     8                  2218     2
    In Process Subtotal  20111    14                 20166    16                 8673     19
    Won Not Invoiced     4369     1                  1736     5                  4950     9
    Won Invoiced         7169     5                  5308     3                  9832     2
    Won Subtotal         11538    6                  7044     8                  14782    11
    Won Percent          27%      23%                20%      25%                54%      31%
    Lost to Competitor   8780     1                  3836     7                  2851     3
    Lost to No Action    2835     5                  4653     1                  1270     2
    Lost Subtotal        11615    6                  8489     8                  4121     5
    Lost Percent         27%      23%                24%      25%                15%      14%
    Total                43264    26                 35699    32                 27576    35

So far, my code looks like this:

    def create_win_lose_table(dataframe):
        in_process_stagename_list = {'In Process Stage 1', 'In Process Stage 2', 'In Process Stage 3'}
        won_stagename_list = {'Won Invoiced', 'Won Not Invoiced'}
        lost_stagename_list = {'Lost to Competitor', 'Lost to No Action'}

        temp_Pipeline_df = dataframe.copy()
        for index, row in temp_Pipeline_df.iterrows():
            if index not in in_process_stagename_list:
                temp_Pipeline_df.drop([index], inplace=True)

        Pipeline_sum = temp_Pipeline_df.sum()
        # at the end I was going to concat the sum to the original dataframe, but that's where I'm stuck

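I've only just started on the In Process part of the dataframe. My thinking is that once I get this figured out, I can repeat the process for the Won and Lost categories. Any thoughts or approaches are welcome.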

Thanks! Jon

Answer 1

Here is a simple example for you.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.rand(5, 5))
    # Sum each column, then turn the resulting Series into a one-row DataFrame
    df_total = df.sum().to_frame().T
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    df_with_totals = pd.concat([df, df_total])

    df_with_totals
              0         1         2         3         4
    0  0.743746  0.668769  0.894739  0.947641  0.753029
    1  0.236587  0.862352  0.329624  0.637625  0.288876
    2  0.817637  0.250593  0.363517  0.572789  0.785234
    3  0.140941  0.221746  0.673470  0.792831  0.170667
    4  0.965435  0.836577  0.790037  0.996000  0.229857
    0  2.904346  2.840037  3.051388  3.946885  2.227662
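
The same sum-then-concatenate pattern can be applied group by group to get the subtotal rows asked for in the question. What follows is a minimal sketch rather than a definitive implementation. It rebuilds the question's table by hand, assumes the stage names are the row index and the columns are a two-level MultiIndex, and uses pd.concat to stitch the pieces together. The names groups, percent_groups, pieces, and result are illustrative and do not come from the original post.

```python
import pandas as pd

# Rebuild the table from the question (assumed layout: stage names as the
# index, two-level columns of salesperson and measure).
columns = pd.MultiIndex.from_product(
    [["SalesPerson 1", "SalesPerson 2", "SalesPerson 3"],
     ["Revenue", "Number of Orders"]]
)
data = {
    "In Process Stage 1": [8347, 8, 9941, 5, 5105, 7],
    "In Process Stage 2": [3879, 2, 3712, 3, 1350, 10],
    "In Process Stage 3": [7885, 4, 6513, 8, 2218, 2],
    "Won Not Invoiced":   [4369, 1, 1736, 5, 4950, 9],
    "Won Invoiced":       [7169, 5, 5308, 3, 9832, 2],
    "Lost to Competitor": [8780, 1, 3836, 7, 2851, 3],
    "Lost to No Action":  [2835, 5, 4653, 1, 1270, 2],
}
df = pd.DataFrame(list(data.values()), index=list(data.keys()), columns=columns)

# Hypothetical grouping of stages; the order here fixes the order of the output.
groups = {
    "In Process": ["In Process Stage 1", "In Process Stage 2", "In Process Stage 3"],
    "Won": ["Won Not Invoiced", "Won Invoiced"],
    "Lost": ["Lost to Competitor", "Lost to No Action"],
}
# The desired output in the question only shows percent rows for these groups.
percent_groups = {"Won", "Lost"}

total = df.sum()  # grand total per column
pieces = []
for name, stages in groups.items():
    block = df.loc[stages]   # the detail rows of this group
    subtotal = block.sum()   # column-wise subtotal as a Series
    pieces.append(block)
    # to_frame(label).T turns the Series into a one-row frame with that label
    pieces.append(subtotal.to_frame(f"{name} Subtotal").T)
    if name in percent_groups:
        # Share of the grand total, formatted as a whole-number percent string
        pct = (subtotal / total * 100).round().astype(int).astype(str) + "%"
        pieces.append(pct.to_frame(f"{name} Percent").T)
pieces.append(total.to_frame("Total").T)

result = pd.concat(pieces)
print(result)
```

Because the percent rows hold strings such as "27%", the columns of result end up with object dtype. If you need to keep calculating with the numbers, it may be easier to store the percentages as floats and format them only for display.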

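You can give the summary row whatever label you like using the rename method in Pandas, for example by renaming the index of the totals row before you concatenate it.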
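
As a small sketch of that renaming step, assuming the same random 5x5 frame as in the simple example: the one-row totals frame gets the default index label 0, so it is relabelled before concatenation. Renaming afterwards would also rename the original row 0, since both share that label. The label "Total" is just an illustrative choice.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(5, 5))

# The totals frame has a single row labelled 0; relabel it as "Total"
# before concatenating so the original row 0 keeps its own label.
df_total = df.sum().to_frame().T.rename(index={0: "Total"})
df_with_totals = pd.concat([df, df_total])

print(df_with_totals.index.tolist())  # [0, 1, 2, 3, 4, 'Total']
```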
