I have grouped a DataFrame:
rwp_initial.df.loc[rwp_initial.df.sample_name=='sma_initial'].groupby(by=['sample_name','pH','salt','column'])['concentration'].plot(marker = 'o', rot=30)
and get the following output:
sample_name  pH   salt  column
sma_initial  5.7  50    5         Axes(0.125,0.125;0.775x0.775)
                        6         Axes(0.125,0.125;0.775x0.775)
                  100   7         Axes(0.125,0.125;0.775x0.775)
                        8         Axes(0.125,0.125;0.775x0.775)
                  200   9         Axes(0.125,0.125;0.775x0.775)
                        10        Axes(0.125,0.125;0.775x0.775)
                  400   11        Axes(0.125,0.125;0.775x0.775)
                        12        Axes(0.125,0.125;0.775x0.775)
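I want to take the mean within each combination of pH and salt concentration. The two columns in each group are just the same samples measured twice. If I use aggregate(np.mean), the mean is computed over all data points of one column instead.
This figure may highlight the data points I want to average (I want to average along the rows):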
rwp_initial.df.loc[rwp_initial.df.sample_name=='sma_initial'].groupby(by=['sample_name','pH','salt'])['concentration'].plot(marker = 'o', rot=30)
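To make the goal concrete, here is a minimal sketch on made-up data. The frame `df`, the row labels `A`/`B`, and the concentration values are all assumptions for illustration and are not taken from `rwp_initial.df`. It contrasts the per-column mean that the grouping above produces with the per-row mean across duplicate columns that is actually wanted.

```python
import pandas as pd

# Hypothetical data: rows A and B, each measured twice (plate columns 5 and 6).
df = pd.DataFrame(
    {
        "pH": [5.7] * 4,
        "salt": [50] * 4,
        "column": [5, 5, 6, 6],
        "concentration": [1.0, 2.0, 3.0, 4.0],
    },
    index=pd.Index(["A", "B", "A", "B"], name="row"),
)

# Grouping by 'column' as well averages down each column:
# one number per column, which is not the desired result.
per_column = df.groupby(["pH", "salt", "column"])["concentration"].mean()

# Desired: one mean per row, taken across the duplicate columns.
# Put the duplicate columns side by side, then average along axis 1.
wide = df.set_index("column", append=True)["concentration"].unstack("column")
per_row = wide.mean(axis=1)

print(per_column)
print(per_row)
```

In the wide layout each row label appears once, with the two duplicate measurements next to each other, so averaging along `axis=1` gives the row-wise mean.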
Answer 0 (score: 0)
OK, I found the answer:
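import numpy as np
grp_initial = rwp_initial.df.loc[rwp_initial.df.sample_name=='sma_initial'].groupby(by=['sample_name','pH','salt']).concentration
for grp, val in grp_initial:
    # Within each (sample_name, pH, salt) group, average the duplicates per 'row' index level
    print(val.groupby(level='row').aggregate(np.mean))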
This works.
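The loop above could probably also be written as a single groupby. This is a sketch under the assumption that the frame has an index level named `row` (as the `level='row'` call suggests). The frame `df` and its values are invented stand-ins for `rwp_initial.df`.

```python
import pandas as pd

# Hypothetical stand-in for rwp_initial.df: each salt level has two plate
# columns holding duplicate measurements, indexed by a level named 'row'.
df = pd.DataFrame(
    {
        "sample_name": ["sma_initial"] * 8,
        "pH": [5.7] * 8,
        "salt": [50, 50, 50, 50, 100, 100, 100, 100],
        "column": [5, 5, 6, 6, 7, 7, 8, 8],
        "concentration": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
    },
    index=pd.Index(["A", "B", "A", "B", "A", "B", "A", "B"], name="row"),
)

# Turn 'row' into an ordinary column and add it to the group keys, leaving
# 'column' out: the duplicates then share a group and are averaged per row.
row_means = (
    df.loc[df.sample_name == "sma_initial"]
    .reset_index()
    .groupby(["sample_name", "pH", "salt", "row"])["concentration"]
    .mean()
)
print(row_means)
```

The result is one Series indexed by (sample_name, pH, salt, row) rather than a separate printout per group, which may be easier to plot or process further.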