Subset a DataFrame based on a condition in pandas

Time: 2020-01-26 12:04:37

Tags: pandas pandas-groupby

I have a data frame as shown below:

Contract_ID    Unit_ID    Start_date     End_Date     Status
1              A          2014-05-01     2015-05-01   Closed
2              A          2016-05-01     2017-05-01   Expired
3              A          2018-05-01     2020-05-01   Active
4              B          2014-05-01     2015-05-01   Closed
5              B          2015-05-01     2016-05-01   Closed
6              C          2016-05-01     2017-05-01   Closed
7              C          2017-05-01     2018-05-01   Expired
8              D          2016-05-01     2017-05-01   Closed
9              D          2017-06-01     2018-05-01   Expired
10             D          2018-07-01     2020-08-01   Active
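
For anyone who wants to reproduce the question, the table above can be rebuilt roughly as follows. This is only a sketch: the name df matches the answers below, and the dates are kept as plain strings here, which is an assumption since the question does not state the column dtypes.

```python
import pandas as pd

# Sample data copied from the table in the question.
# Dates are stored as strings (an assumption; the original dtypes are not given).
df = pd.DataFrame({
    "Contract_ID": list(range(1, 11)),
    "Unit_ID": list("AAABBCCDDD"),
    "Start_date": ["2014-05-01", "2016-05-01", "2018-05-01", "2014-05-01",
                   "2015-05-01", "2016-05-01", "2017-05-01", "2016-05-01",
                   "2017-06-01", "2018-07-01"],
    "End_Date": ["2015-05-01", "2017-05-01", "2020-05-01", "2015-05-01",
                 "2016-05-01", "2017-05-01", "2018-05-01", "2017-05-01",
                 "2018-05-01", "2020-08-01"],
    "Status": ["Closed", "Expired", "Active", "Closed", "Closed",
               "Closed", "Expired", "Closed", "Expired", "Active"],
})

print(df)
```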

From the above, I want to find the units that have no contract with the status Active.

In the table above, units A and D have an Active status.

Expected output:

Contract_ID    Unit_ID    Start_date     End_Date     Status
4              B          2014-05-01     2015-05-01   Closed
5              B          2015-05-01     2016-05-01   Closed
6              C          2016-05-01     2017-05-01   Closed
7              C          2017-05-01     2018-05-01   Expired

2 Answers:

Answer 0: (score: 2)

The first idea is to use GroupBy.transform with GroupBy.all to keep only the groups in which no row has the status Active:

df1 = df[df.assign(New=df['Status'].ne('Active')).groupby('Unit_ID')['New'].transform('all')]
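
The one-liner above may be easier to follow when split into steps. The sketch below uses a reduced copy of the question's data (only the two columns that matter), and the intermediate names flagged and mask are illustrative, not part of the original answer.

```python
import pandas as pd

# Reduced copy of the question's data: only the columns used by the filter.
df = pd.DataFrame({
    "Unit_ID": list("AAABBCCDDD"),
    "Status": ["Closed", "Expired", "Active", "Closed", "Closed",
               "Closed", "Expired", "Closed", "Expired", "Active"],
})

# Step 1: flag every row whose status is not Active.
flagged = df.assign(New=df["Status"].ne("Active"))

# Step 2: within each Unit_ID, check whether all rows are flagged,
# and broadcast that per-group answer back to every row of the group.
mask = flagged.groupby("Unit_ID")["New"].transform("all")

# Step 3: keep only rows that belong to a group with no Active row.
df1 = df[mask]
print(df1)
```

Because transform returns a result aligned to the original index, mask can be used directly for boolean indexing without any merge step.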

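Or first use DataFrame.loc to collect the Unit_ID values of every group that has at least one Active row, then use Series.isin with an inverted mask to keep only the groups with no Active row: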

df1 = df[~df['Unit_ID'].isin(df.loc[df['Status'].eq('Active'), 'Unit_ID'])]

print(df1)
   Contract_ID Unit_ID  Start_date    End_Date   Status
3            4       B  2014-05-01  2015-05-01   Closed
4            5       B  2015-05-01  2016-05-01   Closed
5            6       C  2016-05-01  2017-05-01   Closed
6            7       C  2017-05-01  2018-05-01  Expired
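
The Series.isin variant can also be unpacked into separate steps. This is a sketch on a reduced copy of the question's data; active_units and mask are illustrative names introduced here.

```python
import pandas as pd

# Reduced copy of the question's data: only the columns used by the filter.
df = pd.DataFrame({
    "Contract_ID": list(range(1, 11)),
    "Unit_ID": list("AAABBCCDDD"),
    "Status": ["Closed", "Expired", "Active", "Closed", "Closed",
               "Closed", "Expired", "Closed", "Expired", "Active"],
})

# Step 1: Unit_ID values of the rows whose status is Active.
active_units = df.loc[df["Status"].eq("Active"), "Unit_ID"]

# Step 2: True for rows whose unit never appears among the active units.
mask = ~df["Unit_ID"].isin(active_units)

# Step 3: apply the mask.
df1 = df[mask]
print(df1)
```

This version avoids a groupby entirely, which may be convenient when only a membership test is needed.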

Answer 1: (score: 2)

Another approach, using pd.crosstab and Series.map:

new_df = df[df['Unit_ID'].map(pd.crosstab(df['Unit_ID'],df['Status'])['Active'].eq(0))]
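
To see what the crosstab line is doing, it can help to look at the intermediate table. The sketch below runs on a reduced copy of the question's data; counts, no_active and mask are illustrative names introduced here.

```python
import pandas as pd

# Reduced copy of the question's data: only the columns used by the filter.
df = pd.DataFrame({
    "Contract_ID": list(range(1, 11)),
    "Unit_ID": list("AAABBCCDDD"),
    "Status": ["Closed", "Expired", "Active", "Closed", "Closed",
               "Closed", "Expired", "Closed", "Expired", "Active"],
})

# Step 1: count how often each status occurs per unit
# (rows are Unit_ID values, columns are Status values).
counts = pd.crosstab(df["Unit_ID"], df["Status"])

# Step 2: per unit, True when the Active count is zero.
no_active = counts["Active"].eq(0)

# Step 3: look up each row's unit in that per-unit Series.
mask = df["Unit_ID"].map(no_active)

new_df = df[mask]
print(new_df)
```

Note that counts["Active"] raises a KeyError if no row in the data has the status Active, since the crosstab then has no such column.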

Or with GroupBy.transform:

new_df = df[df['Status'].ne('Active').groupby(df['Unit_ID']).transform('all')]

Output:

   Contract_ID Unit_ID  Start_date    End_Date   Status
3            4       B  2014-05-01  2015-05-01   Closed
4            5       B  2015-05-01  2016-05-01   Closed
5            6       C  2016-05-01  2017-05-01   Closed
6            7       C  2017-05-01  2018-05-01  Expired