Pandas: Using ~mask to filter rows from data based on multiple conditions

Time: 2016-07-29 19:01:26

Tags: regex pandas

I am trying to filter rows out of my data:

              cid  date  catcode  amtsum
145403  N00000286  2009    F1100   0.500
199228  N00000286  2009    Z5100   4.000
485489  N00000286  2007    B4000   3.300
485547  N00000286  2007    F5100   5.000
488556  N00000286  2007    E4100   2.500
490622  N00000286  2007    F1400   5.000
490924  N00000286  2007    T3100   1.000
490957  N00000286  2007    K1200   5.000
495039  N00000286  2007    Z5300   0.051
496078  N00000286  2008    K1000  13.100

Here is some of my code:

#This is data for Barack Obama that I do not want in my data frame. The 'cid' code identifies Obama, I want to remove Obama for the years specified by 'date'.
mask = (campaign_contributions['cid'] == 'N00009638') & (campaign_contributions['date'] >= 2007) 
campaign_contributions = campaign_contributions[~mask]

#This is data for John McCain that I do not want in my data frame. The 'cid' code identifies McCain, I want to remove McCain for the years specified by 'date'.
mask1 = (campaign_contributions['cid'] == 'N00006424') & (campaign_contributions['date'] == 2008) & (campaign_contributions['date'] == 2007) 
campaign_contributions = campaign_contributions[~mask1]

#This is data for Bob Barr that I do not want in my data frame. The 'cid' code identifies Barr, I want to remove Barr for the years specified by 'date'.
mask2 = (campaign_contributions['cid'] == 'N00002526') & (campaign_contributions['date'] == 2008) & (campaign_contributions['date'] == 2007) 
campaign_contributions = campaign_contributions[~mask2]

#This is data for Ralph Nader that I do not want in my data frame.The 'cid' code identifies Nader, I want to remove Nader for the years specified by 'date'.
mask3 = (campaign_contributions['cid'] == 'N00000086') & (campaign_contributions['date'] == 2008) & (campaign_contributions['date'] == 2007)
campaign_contributions = campaign_contributions[~mask3]

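The code above describes the rows I want to filter out. I think I am using the ~mask approach incorrectly. Ideally, my final result would be a data frame without the rows specified above, i.e. I do not want this information to appear in my data frame.

Can someone point me in the right direction?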

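1 Answer:

Answer 0: (score: 4)

You can combine the masks with the bitwise AND operator &, negating each one so that only rows matching none of them are kept. It might look like this: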

campaign_contributions = campaign_contributions[~mask & ~mask1 & ~mask2 & ~mask3]

Or you can combine the masks with the OR operator | and negate the result once, which is equivalent:

campaign_contributions = campaign_contributions[~(mask | mask1 | mask2 | mask3)]
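
To see that the two forms select the same rows, here is a minimal sketch on a made-up data frame. The rows, the amtsum values, and the simplified mask1 (a single year) are assumptions for illustration, not the asker's real data.

```python
import pandas as pd

# Hypothetical toy data, for illustration only.
df = pd.DataFrame({
    "cid": ["N00009638", "N00009638", "N00006424", "N00000286"],
    "date": [2006, 2008, 2008, 2007],
    "amtsum": [1.0, 2.0, 3.0, 4.0],
})

# Each mask marks the rows to drop for one candidate.
mask = (df["cid"] == "N00009638") & (df["date"] >= 2007)
mask1 = (df["cid"] == "N00006424") & (df["date"] == 2008)

# Keep rows that match none of the masks, written two equivalent ways.
kept_and = df[~mask & ~mask1]
kept_or = df[~(mask | mask1)]

print(kept_and.equals(kept_or))
print(kept_or)
```

Both expressions drop the same rows; the second is usually easier to read once there are several masks.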

You can find more information in this post.
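
One more thing that may be worth checking in the question's code: a condition such as (date == 2008) & (date == 2007) can never be True for a single row, so mask1, mask2 and mask3 as written select nothing and ~mask1 keeps every row. Assuming the intent was "either year", Series.isin is one way to express it. The sketch below uses made-up rows, and the variable names never, either and filtered are only for illustration.

```python
import pandas as pd

# Hypothetical toy data, for illustration only.
df = pd.DataFrame({
    "cid": ["N00006424", "N00006424", "N00006424", "N00000286"],
    "date": [2006, 2007, 2008, 2007],
})

# Always False: one value cannot equal 2008 and 2007 at the same time.
never = (df["cid"] == "N00006424") & (df["date"] == 2008) & (df["date"] == 2007)

# Assumed intent: match rows for this cid in either year.
either = (df["cid"] == "N00006424") & df["date"].isin([2007, 2008])

# Drop the matched rows and keep everything else.
filtered = df[~either]

print(never.any())
print(filtered)
```

If that reading of the intent is right, the same isin condition can replace the year comparisons in mask1, mask2 and mask3 before combining them as shown above.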