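I'm running .mean() on my data in Python pandas, but it only returns the key I'm grouping by and the mean of the requested column. I want to keep all the columns, with the original values replaced by the mean. I've tried several things, but nothing seems to work. Here is the code I'm using to produce the data: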
dd1=df.groupby(['key']).agg({'sales':"mean"}).reset_index()
Answer 0 (score: 0)
dd1 = df.groupby('key').transform('mean')
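The difference from agg is that transform returns a result aligned with the original rows instead of one row per group. A minimal sketch of that contrast, using a small made-up frame (the names toy, sales, and units are assumptions for illustration, not the asker's real data):

```python
import pandas as pd

# Hypothetical toy data: two groups, two numeric columns.
toy = pd.DataFrame({
    "key": ["A", "A", "B", "B"],
    "sales": [10.0, 20.0, 30.0, 50.0],
    "units": [1, 3, 5, 7],
})

# agg collapses each group to a single row and keeps only the requested column.
agg_result = toy.groupby("key").agg({"sales": "mean"}).reset_index()

# transform broadcasts each group's mean back onto every original row.
transform_result = toy.groupby("key").transform("mean")

print(agg_result.shape)        # (2, 2)
print(transform_result.shape)  # (4, 2)
```

The transform result has the same index as the input, so it can be lined up with the original rows directly.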
Minimal, Complete, and Verifiable Example (MCVE)
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(10, 4)
).add_prefix('C').assign(key=np.random.choice(list('AB'), 10))
print(df)
C0 C1 C2 C3 key
0 -0.239780 0.167832 0.879349 0.643696 A
1 0.517747 0.573424 -0.480853 -0.162014 A
2 0.236032 -0.396924 -1.406381 1.197946 A
3 0.479451 -0.790073 0.219239 -0.157358 B
4 -0.605864 -0.461622 -1.427521 -1.709760 B
5 -0.281919 -0.965817 1.256316 -1.351529 A
6 -2.085293 0.954725 -1.744391 -1.069667 A
7 -2.100504 -1.161964 -1.102306 0.547207 B
8 1.808283 -0.728799 -1.763971 -1.221539 B
9 -0.975264 0.958484 -0.458139 1.796640 B
Now apply transform:
dd1 = df.groupby('key').transform('mean')
print(dd1)
C0 C1 C2 C3
0 -0.370642 0.066648 -0.299192 -0.148313
1 -0.370642 0.066648 -0.299192 -0.148313
2 -0.370642 0.066648 -0.299192 -0.148313
3 -0.278780 -0.436795 -0.906540 -0.148962
4 -0.278780 -0.436795 -0.906540 -0.148962
5 -0.370642 0.066648 -0.299192 -0.148313
6 -0.370642 0.066648 -0.299192 -0.148313
7 -0.278780 -0.436795 -0.906540 -0.148962
8 -0.278780 -0.436795 -0.906540 -0.148962
9 -0.278780 -0.436795 -0.906540 -0.148962
You can restrict it to a few columns like this:
cols = ['C0', 'C1', 'C2', 'C3']
dd1 = df.groupby('key')[cols].transform('mean')
print(dd1)
C0 C1 C2 C3
0 -0.370642 0.066648 -0.299192 -0.148313
1 -0.370642 0.066648 -0.299192 -0.148313
2 -0.370642 0.066648 -0.299192 -0.148313
3 -0.278780 -0.436795 -0.906540 -0.148962
4 -0.278780 -0.436795 -0.906540 -0.148962
5 -0.370642 0.066648 -0.299192 -0.148313
6 -0.370642 0.066648 -0.299192 -0.148313
7 -0.278780 -0.436795 -0.906540 -0.148962
8 -0.278780 -0.436795 -0.906540 -0.148962
9 -0.278780 -0.436795 -0.906540 -0.148962
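Note that the transform output no longer contains the key column. If the goal is to keep every column, including key, with the values overwritten by the group means, one option is to assign the result back into a copy of the frame. A sketch on the same kind of random data (the name out is made up here):

```python
import numpy as np
import pandas as pd

# Same shape of data as the MCVE: four numeric columns plus a grouping key.
df = pd.DataFrame(np.random.randn(10, 4)).add_prefix("C").assign(
    key=np.random.choice(list("AB"), 10)
)

cols = ["C0", "C1", "C2", "C3"]

# Work on a copy so the original values stay available.
out = df.copy()

# Overwrite only the numeric columns; the key column is left untouched.
out[cols] = df.groupby("key")[cols].transform("mean")

print(out)
```

Assigning into a copy is a design choice rather than a requirement. Writing to df[cols] directly should work as well if the original values are no longer needed.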