Pandas: filtering rows using multiple fields together

Time: 2016-06-30 14:11:45

Tags: python pandas dataframe

I have a pandas DataFrame like this:

In [34]: people = pandas.DataFrame({'name' : ['John', 'John', 'Mike', 'Sarah', 'Julie'], 'age' : [28, 18, 18, 2, 69]})
         people = people[['name', 'age']]
         people
Out[34]: 
    name    age
0   John    28
1   John    18
2   Mike    18
3   Sarah   2
4   Julie   69

I want to filter this DataFrame using the following tuples:

In [35]: filter = [('John', 28), ('Mike', 18)]

The output should look like this:

Out[35]: 
    name    age
0   John    28
2   Mike    18

I tried doing this:

In [36]: mask = people.isin({'name': ['John', 'Mike'], 'age': [28, 18]}).all(axis=1)
         people[mask]

However, it shows me both Johns, because it filters each column independently (the ages of both Johns appear in the age array).

How can I filter rows based on multiple fields together?
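To make the problem concrete, here is a minimal sketch that reproduces it with the data from the question. It only illustrates why the per-column check lets the unwanted row through; the variable name `per_column` is introduced here for illustration.

```python
import pandas as pd

people = pd.DataFrame({'name': ['John', 'John', 'Mike', 'Sarah', 'Julie'],
                       'age': [28, 18, 18, 2, 69]})

# With a dict, isin checks each column against its own list, independently.
per_column = people.isin({'name': ['John', 'Mike'], 'age': [28, 18]})
mask = per_column.all(axis=1)

# Row 1 is ('John', 18): 'John' is an allowed name and 18 is an allowed age,
# so the row passes even though the pair ('John', 18) was never requested.
print(people[mask])
```

The mask is true for rows 0, 1 and 2, which is why both Johns show up.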

1 answer:

Answer 0: (score: 4)

This should work:

people.set_index(people.columns.tolist(), drop=False).loc[filter].reset_index(drop=True)

Cleaned up and explained:

# set_index with the columns you want to reference in tuples
cols = ['name', 'age']
people = people.set_index(cols, drop=False)
#                                   ^
#                                   |
#   ensure the cols stay in dataframe

#   does what you
#   want but now has
#   index that was
#   not there
# /--------------\
people.loc[filter].reset_index(drop=True)
#                 \---------------------/
#                  Gets rid of that index
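As a self-contained variation on the same idea, the sketch below builds the MultiIndex only to test each (name, age) pair as a unit, then uses the resulting boolean mask on the original frame. This is not the answer's exact code: it swaps `.loc` for `MultiIndex.isin`, and it keeps the original row labels (0 and 2, as in the desired output) instead of resetting them. The name `wanted` is an assumption made here; the question calls the list `filter`, which shadows a Python builtin.

```python
import pandas as pd

people = pd.DataFrame({'name': ['John', 'John', 'Mike', 'Sarah', 'Julie'],
                       'age': [28, 18, 18, 2, 69]})

# Called `filter` in the question; renamed here to avoid shadowing the builtin.
wanted = [('John', 28), ('Mike', 18)]

cols = ['name', 'age']

# set_index(cols) yields a MultiIndex of (name, age) pairs; isin then tests
# each pair as a whole against the list of tuples, returning a boolean array.
pair_mask = people.set_index(cols).index.isin(wanted)

# Index the original frame with the mask, so the original labels are kept.
result = people[pair_mask]
print(result)
```

If fresh 0..n-1 labels are preferred, as in the answer above, `result.reset_index(drop=True)` can be appended.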