NaN returned when filtering a grouped DataFrame

Time: 2018-06-20 17:41:50

Tags: python pandas dataframe filter

Can someone shed some light on why I am getting "NaN" in the following?

Here is my dataframe:

import pandas as pd

df2 = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo', 'jack'],
                    'B': ['one', 'one', 'two', 'three',
                          'two', 'two', 'one', 'three', 'four']})

Then I group by column "A":

df3 = df2.groupby('A')
for A, group in df3:
    print(A)
    print(group)

Result:

bar
     A      B
1  bar    one
3  bar  three
5  bar    two
foo
     A      B
0  foo    one
2  foo    two
4  foo    two
6  foo    one
7  foo  three
jack
      A     B
8  jack  four

So far so good. What I want to return is the set of groups, keeping only the rows where column "B" is "one" or "two":

df4 = df3.apply (lambda x: (x[x['B'] == 'one']) | (x[x['B'] == 'two']))

The result I get is:

        A   B
A           
bar 1   NaN NaN
    5   NaN NaN
foo 0   NaN NaN
    2   NaN NaN
    4   NaN NaN

2 Answers:

Answer 0 (score: 1)

Why not filter beforehand?

pd.concat({k : g for k, g in df2[df2.B.isin(['one', 'two'])].groupby('A')})

         A    B
bar 1  bar  one
    5  bar  two
foo 0  foo  one
    2  foo  two
    4  foo  two
    6  foo  one

If you only want the separate groups, without concatenating them back together, stop at:

groups = {k : g for k, g in df2[df2.B.isin(['one', 'two'])].groupby('A')}

Each group can then be accessed via groups['bar'] and groups['foo'].
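
As a minimal, self-contained sketch of that dictionary approach (the names filtered and groups are just illustrative), note that a group with no matching rows, such as 'jack' here, never appears as a key, so dict.get is a safer way to look it up:

```python
import pandas as pd

df2 = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo', 'jack'],
                    'B': ['one', 'one', 'two', 'three',
                          'two', 'two', 'one', 'three', 'four']})

# Filter first, so only rows with B equal to 'one' or 'two' reach groupby.
filtered = df2[df2.B.isin(['one', 'two'])]

# One sub-DataFrame per remaining value of 'A'.
groups = {k: g for k, g in filtered.groupby('A')}

print(sorted(groups))       # keys that survived the filter
print(groups['bar'])        # rows of the 'bar' group

# 'jack' only had B == 'four', so it was filtered out before grouping.
# groups['jack'] would raise KeyError; .get returns None instead.
print(groups.get('jack'))
```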

Answer 1 (score: 0)

Another way is to use groupby and apply:

df2.groupby('A').apply(lambda x: x[x['B'].isin(['one','two'])])

         A    B
A              
bar 1  bar  one
    5  bar  two
foo 0  foo  one
    2  foo  two
    4  foo  two
    6  foo  one
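
For comparison, the expression in the question applies | to two already-filtered DataFrames rather than to two boolean masks. pandas aligns DataFrames on their index labels in binary operations, which is presumably where the NaN values come from. A sketch of the mask-based form, which should select the same rows as the isin call above, might look like this (the helper name keep_one_or_two and the dict kept are hypothetical, and a plain loop is used instead of apply to keep the example simple):

```python
import pandas as pd

df2 = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo', 'jack'],
                    'B': ['one', 'one', 'two', 'three',
                          'two', 'two', 'one', 'three', 'four']})


def keep_one_or_two(group):
    """Hypothetical helper: keep rows whose 'B' value is 'one' or 'two'."""
    # Combine the two boolean Series first, then index the frame once.
    mask = (group['B'] == 'one') | (group['B'] == 'two')
    return group[mask]


# Apply the helper group by group, skipping groups left with no rows.
kept = {}
for name, group in df2.groupby('A'):
    rows = keep_one_or_two(group)
    if not rows.empty:
        kept[name] = rows

# Stack the surviving groups; the dict keys become the outer index level.
result = pd.concat(kept)
print(result)
```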