Filter by the number of occurrences of a boolean value per group in a dataframe

Time: 2018-09-11 16:37:45

Tags: python pandas dataframe pandas-groupby

I am working with a dataframe that includes a column of boolean data. I need to remove every row that belongs to a group in column A with more than 2 False values in the 'match' column. The data looks like this:

       A  match
52     7   True
53     7   True
54     7   False
55     7   False
56     7   False
57     7   False
437    8   True
438    8   True
439    8   True
440    8   True
441    8   True
442    8   False
488    2   False
489    2   True
490    2   True

I tried grouping by column A and then counting the number of False values, but I am stuck here. Any ideas?

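3 Answers:

Answer 0: (score: 2)

Negate your column, group by A, and use transform to get the number of False values per group, aligned with the original rows:

s = (~df.match).groupby(df.A).transform('sum')

Next, use loc to select the rows whose group has at most 2 False values: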

df.loc[s.le(2)]

     A  match 
437  8   True 
438  8   True 
439  8   True 
440  8   True 
441  8   True 
442  8  False 
488  2  False 
489  2   True 
490  2   True 

As a one-liner:

df.loc[(~df.match).groupby(df.A).transform('sum').le(2)]
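
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same transform approach. The variable names `false_count` and `result` are my own, and I use `df["match"]` instead of `df.match` only as a matter of taste.

```python
import pandas as pd

# Sample data copied from the question, including its index labels.
df = pd.DataFrame(
    {
        "A": [7] * 6 + [8] * 6 + [2] * 3,
        "match": [True, True, False, False, False, False,
                  True, True, True, True, True, False,
                  False, True, True],
    },
    index=[52, 53, 54, 55, 56, 57,
           437, 438, 439, 440, 441, 442,
           488, 489, 490],
)

# Negating the column turns False into True, so summing counts the False values.
# transform broadcasts each group's count back onto every row of that group.
false_count = (~df["match"]).groupby(df["A"]).transform("sum")

# Keep only the rows whose group has at most 2 False values.
result = df.loc[false_count.le(2)]
print(result)
```

Because `transform` returns a Series with the same index as `df`, it can be used directly as a boolean mask without any merge or lookup step.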

Answer 1: (score: 2)

Use filter.
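
The code for this answer is not shown here, so what follows is only a sketch of how a filter-based solution might look, assuming it means pandas' `groupby(...).filter(...)`. The sample data is rebuilt from the question, and the lambda is my own guess at the condition, not necessarily what the answerer wrote.

```python
import pandas as pd

# Sample data copied from the question, including its index labels.
df = pd.DataFrame(
    {
        "A": [7] * 6 + [8] * 6 + [2] * 3,
        "match": [True, True, False, False, False, False,
                  True, True, True, True, True, False,
                  False, True, True],
    },
    index=[52, 53, 54, 55, 56, 57,
           437, 438, 439, 440, 441, 442,
           488, 489, 490],
)

# filter calls the function once per group and keeps all rows of a group
# when the function returns True for it.
result = df.groupby("A").filter(lambda g: (~g["match"]).sum() <= 2)
print(result)
```

This reads closest to the wording of the question, though calling a Python lambda per group will likely be slower than the transform approach on frames with many groups.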

Answer 2: (score: 1)

Use isin together with groupby and sum

s = (~df['match']).groupby(df['A']).sum() <= 2
df.loc[df.A.isin(s[s].index)]
Out[92]: 
     A  match
437  8   True
438  8   True
439  8   True
440  8   True
441  8   True
442  8  False
488  2  False
489  2   True
490  2   True
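
To make the intermediate steps of the isin approach visible, here is a small sketch on hypothetical data (the groups "x" and "y" are made up for illustration and are not from the question). It also shows why the boundary matters: the question asks to drop groups with more than 2 False values, so a group with exactly 2 should survive.

```python
import pandas as pd

# Hypothetical data: group "x" has exactly 2 False values, group "y" has 3.
df = pd.DataFrame(
    {
        "A": ["x", "x", "x", "y", "y", "y"],
        "match": [True, False, False, False, False, False],
    }
)

# One False count per group, indexed by the distinct values of A.
false_per_group = (~df["match"]).groupby(df["A"]).sum()

# Group keys that are allowed to stay: at most 2 False values.
keep = false_per_group[false_per_group <= 2].index

# Select the rows whose A value is one of the kept keys.
result = df.loc[df["A"].isin(keep)]
print(result)
```

With a strict `< 2` comparison, group "x" would be dropped as well, which does not match "more than 2 False values".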