Question

I have a df that looks like this:

Date    Value
2020    0
2020    100
2020    200
2020    300
2021    100
2021    150
2021    0

I want to get the mean of Value grouped by Date, counting only rows where Value > 0. When I try:

df['Yearly AVG'] = df[df['Value']>0].groupby('Date')['Value'].mean()

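the new column contains only NaN values. When I print the right-hand side of that line on its own, I get the numbers I need, but indexed by Date: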

Date
2020    200
2021    125

How can I get the following?

Date    Value    Yearly AVG
2020    0        200
2020    100      200
2020    200      200
2020    300      200
2021    100      125
2021    150      125
2021    0        125

Answer 1

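The trick is to replace the non-matching values with missing values, then use GroupBy.transform to fill a new column with the aggregated value for each group: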

# Mask values that are not > 0 as NaN, then broadcast each year's mean back to every row
df['Yearly AVG'] = df['Value'].where(df['Value'] > 0).groupby(df['Date']).transform('mean')
print(df)
   Date  Value  Yearly AVG
0  2020      0       200.0
1  2020    100       200.0
2  2020    200       200.0
3  2020    300       200.0
4  2021    100       125.0
5  2021    150       125.0
6  2021      0       125.0

Details:

print(df['Value'].where(df['Value'] > 0))
0      NaN
1    100.0
2    200.0
3    300.0
4    100.0
5    150.0
6      NaN
Name: Value, dtype: float64
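
The masked zeros drop out of each year's average because mean skips NaN by default, while transform still returns one value per original row. A minimal self-contained sketch of this, with the sample frame rebuilt from the question and the name masked chosen here only for illustration:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Date": [2020, 2020, 2020, 2020, 2021, 2021, 2021],
    "Value": [0, 100, 200, 300, 100, 150, 0],
})

# Rows where Value is not > 0 become NaN; the index stays 0..6
masked = df["Value"].where(df["Value"] > 0)

# Group the masked values by year; transform('mean') ignores NaN when
# averaging and returns a Series aligned with the original rows
df["Yearly AVG"] = masked.groupby(df["Date"]).transform("mean")

print(df)
```

Since the result keeps the original row index, it can be assigned to a new column directly without any mapping step.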

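Your own solution needs one change: map the grouped means back onto the Date column instead of assigning them directly: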

df['Yearly AVG'] = df['Date'].map(df[df['Value']>0].groupby('Date')['Value'].mean())
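
As for why the direct assignment gives NaN: the grouped result is indexed by Date (2020, 2021), while df has the default row index 0 to 6, and pandas aligns on index labels when a Series is assigned to a column, so no labels match. A minimal sketch of the difference, with the sample frame rebuilt from the question; the names yearly, misaligned, and mapped are illustrative, and reindex is used here only to approximate what the assignment does:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Date": [2020, 2020, 2020, 2020, 2021, 2021, 2021],
    "Value": [0, 100, 200, 300, 100, 150, 0],
})

# Mean of the positive values per year; the index of this Series is Date
yearly = df[df["Value"] > 0].groupby("Date")["Value"].mean()

# Aligning on df's row labels (0..6) finds no match among the labels
# 2020 and 2021, so every entry is NaN
misaligned = yearly.reindex(df.index)

# map looks up each row's Date in yearly, giving one mean per row
mapped = df["Date"].map(yearly)

print(misaligned)
print(mapped)
```

Note that a year with no positive values would be absent from yearly, so map would leave NaN for its rows.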
