This is my code:
mean = all_data.groupby(['Id'])[features].agg('mean').reset_index()
all_data = pd.merge(all_data, mean, suffixes=["", "_mean"], how='left', on=['Id'])
Now I want to add another column to the all_data frame, like this:
meanDivide = all_data[features] / mean
all_data = pd.merge(all_data, meanDivide, suffixes=["", "_meanDivide"], how='left', on=['Id'])
I want to join it to all_data on Id, and then replace the NaN and inf values with 0 in pandas. I have spent almost the whole day on this and still have problems.
Edit: my all_data looks like this:
Id  Row1  Row2
1   6     0
2   5     3
3   2     2
4   0     0
5   3     8
The features variable is as follows:
features = ['Row1','Row2']
The data in CSV format:
Id,Row1,Row2
1,6,0
2,5,3
3,2,2
4,0,0
5,3,8
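Answer 0 (score: 1)
First, you don't need merge. groupby(...).transform('mean') returns the group means already aligned to the original rows, so the new columns can be attached with concat: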
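# Per-Id mean of each feature, aligned to the rows of all_data
newdf = all_data.groupby(['Id'])[features].transform('mean')
# Each value divided by its group mean
newdf2 = all_data[features] / newdf
# Attach both sets of columns and assign the result back
all_data = pd.concat([all_data, newdf.add_suffix('_mean'), newdf2.add_suffix('_meanDivide')], axis=1)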
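The question also asks for NaN and inf to be replaced with 0, which the snippet above does not do. Here is a minimal, self-contained sketch of one way to handle that, using the sample CSV from the question. The names csv_text, group_mean, ratio and result are chosen here for illustration only. It is applied only to the ratio columns, on the assumption that those are the only columns that should be zero-filled.

```python
import io

import numpy as np
import pandas as pd

# Sample data copied from the question.
csv_text = """Id,Row1,Row2
1,6,0
2,5,3
3,2,2
4,0,0
5,3,8
"""
all_data = pd.read_csv(io.StringIO(csv_text))
features = ['Row1', 'Row2']

# Per-Id mean, returned with the same index as all_data (no merge needed).
group_mean = all_data.groupby('Id')[features].transform('mean')

# Element-wise ratio; 0/0 produces NaN and nonzero/0 produces +/-inf.
ratio = all_data[features] / group_mean

# Map +/-inf to NaN first, then fill every NaN with 0.
ratio = ratio.replace([np.inf, -np.inf], np.nan).fillna(0)

result = pd.concat(
    [all_data, group_mean.add_suffix('_mean'), ratio.add_suffix('_meanDivide')],
    axis=1,
)
print(result)
```

In this sample every Id occurs once, so each group mean equals the value itself and the ratio is 1 wherever the value is nonzero. The 0/0 cells come out as NaN and are turned into 0 by the last step.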