pandas: transform nunique on a groupby object while handling NaN values

Date: 2019-02-20 14:53:20

Tags: python pandas dataframe pandas-groupby

I have the following df:

inv_id    cluster_id
793        2
           2
789        3
789        3
           4
           4

I'd like to group by cluster_id and check how many unique inv_id values each group has,

df['same_inv_id'] = df.groupby('cluster_id')['inv_id'].transform('nunique') == 1  

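but I'd like to set same_inv_id = False when a cluster contains only empty/blank inv_id values, and also when a cluster contains one or more empty/blank inv_id values, so the result looks like,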

inv_id    cluster_id    same_inv_id
793        2            False 
           2            False
789        3            True
789        3            True
           4            False
           4            False 
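
For reference, here is a minimal sketch that reproduces the frame above and shows why the plain nunique check is not enough. It assumes the blank inv_id cells are empty strings rather than NaN (the post does not say which), and the name `naive` is only for illustration.

```python
import pandas as pd

# Assumption: blank inv_id cells are empty strings, not NaN.
df = pd.DataFrame({
    "inv_id": ["793", "", "789", "789", "", ""],
    "cluster_id": [2, 2, 3, 3, 4, 4],
})

# nunique counts "" as an ordinary value, so the all-blank
# cluster 4 is reported as having exactly one unique inv_id.
naive = df.groupby("cluster_id")["inv_id"].transform("nunique") == 1
print(naive.tolist())
```

If the blanks were NaN instead, the failure would move: nunique skips NaN by default, so cluster 2 (one real value plus one NaN) would be the one wrongly flagged True.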

1 answer:

Answer 0 (score: 2)

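IIUC, build the "not blank" condition first, then use transform + all so a cluster is True only when none of its inv_id values is blank: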

s1=df.inv_id.ne('').groupby(df.cluster_id).transform('all')
s1
Out[432]: 
0    False
1    False
2     True
3     True
4    False
5    False
Name: inv_id, dtype: bool
s2=df.groupby('cluster_id')['inv_id'].transform('nunique') == 1
# a cluster qualifies only if it has no blanks (s1) and exactly one unique inv_id (s2)
df['same_inv_id']=s1&s2
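
Since the title mentions NaN while the snippet above compares against '', here is a hedged end-to-end sketch that handles both representations. It assumes the sample data from the question; the names `inv`, `all_present` and `one_unique` are illustrative and not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "inv_id": ["793", "", "789", "789", "", ""],
    "cluster_id": [2, 2, 3, 3, 4, 4],
})

# Turn empty strings into NaN so blanks and NaN are treated alike.
# Series.where keeps values where the condition is True, NaN elsewhere.
inv = df["inv_id"].where(df["inv_id"].ne(""))

# True only for clusters in which every inv_id is present.
all_present = inv.notna().groupby(df["cluster_id"]).transform("all")

# True only for clusters with exactly one distinct non-missing inv_id
# (nunique skips NaN by default).
one_unique = inv.groupby(df["cluster_id"]).transform("nunique") == 1

df["same_inv_id"] = all_present & one_unique
print(df)
```

Normalising to NaN first means the same two conditions work whether the source data used '' or real missing values.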