How to perform a groupby in Pandas and compute the mean for each row of the original dataset

Time: 2019-02-08 11:03:39

Tags: python pandas dataframe pandas-groupby

I have a spreadsheet with data in the following format:

Brand | Model    | Year | Cost  | Tax
--------------------------------------
Apple | iPhone 7 | 2017 | $1000 | $100
Apple | iphone 7 | 2018 | $800  | $80
Xiomi | Note 5   | 2017 | $300  | $30
Xiomi | Note 5   | 2018 | $200  | $20

I want to transform the dataset above so that it shows, in a Mean column, the mean of the Cost column when rows are grouped by the ['Brand', 'Model'] columns, along with a Result column holding the sum of the Mean and Tax column values.

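I have been trying to use the groupby function, but I have not found a way to get the desired result described above.

Looking forward to your reply. Thanks.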

1 Answer:

Answer 0: (score: 1)

First convert the values to integers with replace, then use transform to get the mean, then sum, and finally convert back to strings if necessary:

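cols = ['Cost','Tax']
# strip the '$' prefix (raw string avoids an invalid-escape warning) and cast to int
df[cols] = df[cols].replace(r'\$', '', regex=True).astype(int)
# transform returns one mean per original row, aligned to df's index
df['Mean'] = df.groupby(['Brand', 'Model'])['Cost'].transform('mean')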

df['Result'] = df[['Mean','Tax']].sum(axis=1)
print (df)
   Brand     Model  Year  Cost  Tax  Mean  Result
0  Apple  iPhone 7  2017  1000  100  1000    1100
1  Apple  iphone 7  2018   800   80   800     880
2  Xiomi    Note 5  2017   300   30   250     280
3  Xiomi    Note 5  2018   200   20   250     270
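For anyone who wants to run the steps so far end to end, here is a minimal self-contained sketch. The DataFrame is typed in by hand from the table in the question, which is an assumption about how the spreadsheet loads, and the names `grouped`, `per_group` and `per_row` are only illustrative. It contrasts a plain `.mean()` aggregation, which gives one value per group, with `transform('mean')`, which gives one value per original row.

```python
import pandas as pd

# Sample data copied by hand from the question's table (assumed to match the spreadsheet).
df = pd.DataFrame({
    'Brand': ['Apple', 'Apple', 'Xiomi', 'Xiomi'],
    'Model': ['iPhone 7', 'iphone 7', 'Note 5', 'Note 5'],
    'Year': [2017, 2018, 2017, 2018],
    'Cost': ['$1000', '$800', '$300', '$200'],
    'Tax': ['$100', '$80', '$30', '$20'],
})

cols = ['Cost', 'Tax']
# Remove the '$' prefix so both columns can be treated as numbers.
df[cols] = df[cols].replace(r'\$', '', regex=True).astype(int)

grouped = df.groupby(['Brand', 'Model'])['Cost']
# One row per group. There are three groups here, because grouping is
# case-sensitive and 'iPhone 7' and 'iphone 7' count as different models.
per_group = grouped.mean()
# One value per original row, aligned to df.index.
per_row = grouped.transform('mean')

df['Mean'] = per_row
df['Result'] = df['Mean'] + df['Tax']
print(df)
```

Because `transform` keeps the original index, its result can be assigned straight back as a new column, whereas the aggregated `per_group` would first need to be merged back onto the rows.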

And then:

cols1 = cols + ['Result', 'Mean']
df[cols1] = '$' + df[cols1].astype(str)
print (df)
   Brand     Model  Year   Cost   Tax   Mean Result
0  Apple  iPhone 7  2017  $1000  $100  $1000  $1100
1  Apple  iphone 7  2018   $800   $80   $800   $880
2  Xiomi    Note 5  2017   $300   $30   $250   $280
3  Xiomi    Note 5  2018   $200   $20   $250   $270
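Prefixing '$' turns the columns into strings, so further arithmetic on them no longer works. One possible alternative, sketched here with a hypothetical `display_df` copy and a format string chosen for illustration, is to keep the numeric frame and format only a copy for display. The starting values are copied from the numeric output earlier in this answer.

```python
import pandas as pd

# Numeric frame as it stands after the groupby step (values copied from the answer's output).
df = pd.DataFrame({
    'Cost': [1000, 800, 300, 200],
    'Tax': [100, 80, 30, 20],
    'Mean': [1000.0, 800.0, 250.0, 250.0],
    'Result': [1100.0, 880.0, 280.0, 270.0],
})

money_cols = ['Cost', 'Tax', 'Mean', 'Result']
# Build a display-only copy; df itself stays numeric for later calculations.
display_df = df.copy()
for col in money_cols:
    # '.0f' prints whole dollars whether the column holds ints or floats.
    display_df[col] = df[col].map('${:.0f}'.format)

print(display_df)
```

With this approach `df['Mean']` can still be summed or compared, while `display_df` carries the '$'-prefixed strings.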