Select the subset of a DataFrame whose column values appear in a given list

Date: 2018-03-24 07:47:16

Tags: python pandas

I have a DataFrame df and a list members like this:

df = pd.DataFrame({'a':[[20,21],[22,19],[30,27]], 'b':[22,13,7]})
members = [[20,21],[18,21],[15,18]]

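I want to select a subset df1 of df containing only the rows whose value in column 'a' appears in the list members.

For the data above, I would like to get this output: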

df1 = pd.DataFrame({'a':[[20,21]], 'b':[22]})

2 answers:

Answer 0: (score: 1)

Use isin

In [523]: df[df['a'].isin(members)]
Out[523]:
          a   b
0  [20, 21]  22

Or, query

In [530]: df.query('a in @members')
Out[530]:
          a   b
0  [20, 21]  22

Or, apply with the in operator

In [524]: df[df['a'].apply(lambda x: x in members)]
Out[524]:
          a   b
0  [20, 21]  22

Or, a list comprehension

In [536]: df[[x in members for x in df['a']]]
Out[536]:
          a   b
0  [20, 21]  22
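
To make the list comprehension variant easier to follow, here is a self-contained sketch that prints the intermediate boolean mask before using it. It reuses df and members from the question; nothing else is assumed beyond pandas being installed.

```python
import pandas as pd

df = pd.DataFrame({'a': [[20, 21], [22, 19], [30, 27]], 'b': [22, 13, 7]})
members = [[20, 21], [18, 21], [15, 18]]

# `x in members` compares x with each inner list using ==,
# so both the elements and their order must match.
mask = [x in members for x in df['a']]
print(mask)  # [True, False, False]

# A list of booleans with one entry per row selects the rows marked True.
df1 = df[mask]
print(df1)
```

Because the comparison is plain list equality, a row holding [21, 20] would not match the member [20, 21].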

Details

In [525]: df
Out[525]:
          a   b
0  [20, 21]  22
1  [22, 19]  13
2  [30, 27]   7

In [526]: members
Out[526]: [[20, 21], [18, 21], [15, 18]]

In [527]: pd.__version__
Out[527]: '0.23.0.dev0+60.ge09189e'

Answer 1: (score: 0)

One possible approach is to convert the values to tuples and filter using boolean indexing with isin:

df = df[df['a'].apply(tuple).isin([tuple(x) for x in members])]
print (df)
          a   b
0  [20, 21]  22
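
A minimal sketch of the same tuple idea wrapped in a reusable function. The name filter_by_members and its parameters are hypothetical, not part of pandas. As an assumption of this sketch, it replaces isin with a plain set lookup through Series.map, which is not what the answer itself uses.

```python
import pandas as pd


def filter_by_members(frame, column, members):
    """Return the rows of `frame` whose list in `column` equals one of `members`.

    Hypothetical helper for illustration only.
    """
    # Lists are unhashable, so convert each member to a tuple for set lookup.
    member_set = {tuple(m) for m in members}
    # Build a boolean mask: True where the row's list matches a member.
    mask = frame[column].map(lambda value: tuple(value) in member_set)
    return frame[mask]


df = pd.DataFrame({'a': [[20, 21], [22, 19], [30, 27]], 'b': [22, 13, 7]})
members = [[20, 21], [18, 21], [15, 18]]

df1 = filter_by_members(df, 'a', members)
print(df1)
```

The original df is left unchanged, and column 'a' in the result still holds lists, since the tuples are only used to build the mask.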