pandas DataFrame: replace column values based on group

Time: 2019-04-01 11:47:53

Tags: python-3.x pandas pandas-groupby

I have a DataFrame with the following structure:

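  • If the "uuid" column is also non-null for a group (where a group is defined by "master_mac" and "slave_mac"), the corresponding rows should contain NaN in the "rawData" column.

The final result must be: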

     master_mac     slave_mac      uuid rawData
0  ac233fc01403  ac233f26492b  e2c56db5     NaN
1  ac233fc01403  ac233f26492b  e2c56db5     NaN
2  ac233fc01403  ac233f26492b  e2c56db5     NaN
3  ac233fc01403  ac233f26492b  e2c56db5  ac0228
4  ac233fc01403  e464eecba5eb       NaN  590080
5  ac233fc01403  ac233f26492b  e2c56db5  ac0228
6  ac233fc01403  ac233f26492b  e2c56db5     NaN
7  ac233fc01403  ac233f26492b  e2c56db5  636800

Can anyone help me?

2 answers:

Answer 0 (score: 2)

To test each row on its own, use:

m = df['uuid'].notna()

If the test has to be made per group, use GroupBy.transform with GroupBy.any to check whether each group contains at least one non-NaN value:

m = df['uuid'].notna().groupby([df['master_mac'],df['slave_mac']]).transform('any')

df['rawData'] = df['rawData'].mask(m)
print(df)
     master_mac     slave_mac      uuid rawData
0  ac233fc01403  ac233f26492b  e2c56db5     NaN
1  ac233fc01403  ac233f26492b  e2c56db5     NaN
2  ac233fc01403  ac233f26492b  e2c56db5     NaN
3  ac233fc01403  ac233f26492b  e2c56db5     NaN
4  ac233fc01403  e464eecba5eb       NaN  590080
5  ac233fc01403  ac233f26492b  e2c56db5     NaN
6  ac233fc01403  ac233f26492b  e2c56db5     NaN
7  ac233fc01403  ac233f26492b  e2c56db5     NaN
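
In the sample above every row of the first group has a uuid, so the row-wise and group-wise masks give the same result. The difference only shows up when a group mixes null and non-null uuid values. The following is a minimal sketch of that case; the data and the names row_mask, group_mask, row_result and group_result are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group (m1, s1) mixes a present and a missing uuid,
# group (m1, s2) has no uuid at all.
df = pd.DataFrame({
    "master_mac": ["m1", "m1", "m1", "m1"],
    "slave_mac": ["s1", "s1", "s2", "s2"],
    "uuid": ["e2c56db5", np.nan, np.nan, np.nan],
    "rawData": ["ac0228", "590080", "636800", "ac0228"],
})

# Row-wise mask: True only where the row's own uuid is present.
row_mask = df["uuid"].notna()

# Group-wise mask: True for every row of a group that has at least one uuid.
group_mask = row_mask.groupby([df["master_mac"], df["slave_mac"]]).transform("any")

# Series.mask replaces values with NaN wherever the mask is True.
row_result = df["rawData"].mask(row_mask)
group_result = df["rawData"].mask(group_mask)

print(pd.DataFrame({"row_wise": row_result, "group_wise": group_result}))
```

With this data the row-wise mask blanks only row 0, while the group-wise mask blanks rows 0 and 1, because row 1 belongs to a group in which some other row has a uuid.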

Or:

import numpy as np

df.loc[m, 'rawData'] = np.nan
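
The two forms differ in one practical respect: Series.mask returns a new Series and leaves the DataFrame untouched until the result is assigned back, whereas the .loc assignment modifies the DataFrame in place. A small sketch, assuming a made-up two-row frame (the name masked is illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, for illustration only.
df = pd.DataFrame({
    "uuid": ["e2c56db5", np.nan],
    "rawData": ["ac0228", "590080"],
})
m = df["uuid"].notna()

# mask() builds a new Series; df itself is not changed yet.
masked = df["rawData"].mask(m)
unchanged_after_mask = bool(df["rawData"].notna().all())

# .loc assignment writes NaN into df directly.
df.loc[m, "rawData"] = np.nan

# Both routes end with the same column contents.
same_result = masked.equals(df["rawData"])
print(unchanged_after_mask, same_result)
```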

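Answer 1 (score: 0)

If you need to modify the value in the rawData column of each row based on the value in its uuid column, just do the following:

# A single .loc call avoids chained assignment, which may write to a temporary copy.
# Requires: import numpy as np
df.loc[df['uuid'].notna(), 'rawData'] = np.nan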