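I want to drop some of the male members from a dataset. How do I tell Python that I want to keep males X, Y, and Z and drop all the other males? For example, suppose I start with this DataFrame: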
import pandas as pd
df1 = pd.DataFrame({'Salary':[8700,6300,4700,2100,3400], 'Gender':['Male','Female','Male','Female','Male']},index=pd.Series(['Joe Smith', 'Jane Doe', 'Rob Dole', 'Sue Pam', 'Jack Li'], name='Name'))
print(df1)
           Gender  Salary
Name
Joe Smith    Male    8700
Jane Doe   Female    6300
Rob Dole     Male    4700
Sue Pam    Female    2100
Jack Li      Male    3400
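Among the males in the DataFrame, I want to keep Joe Smith and Rob Dole and drop all the other males. What is the fastest way to do this across thousands of names, using the gender identifier? I have a list of about 20-25 names that I want to keep, out of thousands. My final DataFrame should look like this: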
           Gender  Salary
Name
Joe Smith    Male    8700
Jane Doe   Female    6300
Rob Dole     Male    4700
Sue Pam    Female    2100
Answer 0 (score: 1)
Your condition is:
cond = (df1.Gender == 'Female') | (df1.index.isin(['Joe Smith', 'Rob Dole']))
Then all you need is df1[cond].
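For the 20-25 names mentioned in the question, the same mask can be built from a list variable instead of a literal. The following is a minimal sketch, assuming the names to keep are stored in a list called males_to_keep (a hypothetical name, not from the question) and that they match the index labels exactly:

```python
import pandas as pd

df1 = pd.DataFrame(
    {'Salary': [8700, 6300, 4700, 2100, 3400],
     'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']},
    index=pd.Index(['Joe Smith', 'Jane Doe', 'Rob Dole', 'Sue Pam', 'Jack Li'],
                   name='Name'),
)

# Hypothetical list; in practice it would hold the 20-25 names to keep.
males_to_keep = ['Joe Smith', 'Rob Dole']

# Keep every female row, plus any row whose index label is in the list.
cond = (df1['Gender'] == 'Female') | df1.index.isin(males_to_keep)
result = df1[cond]
print(result)
```

Because the mask is a single vectorized boolean operation, it should scale to thousands of rows without an explicit loop. Note that rows whose Gender is neither 'Female' nor listed by name are dropped by this condition.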
Answer 1 (score: 0)
Alternatively, you can use the .query() method:
In [14]: df1.query("Gender in ['Female','Unknown'] or Name in ['Joe Smith','Rob Dole']")
Out[14]:
           Gender  Salary
Name
Joe Smith    Male    8700
Jane Doe   Female    6300
Rob Dole     Male    4700
Sue Pam    Female    2100
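With a longer list of names, typing them into the query string gets unwieldy. A possible variation, sketched here under the assumption that the names live in a local list called males_to_keep (a hypothetical name), is to reference that variable inside the query with the @ prefix:

```python
import pandas as pd

df1 = pd.DataFrame(
    {'Salary': [8700, 6300, 4700, 2100, 3400],
     'Gender': ['Male', 'Female', 'Male', 'Female', 'Male']},
    index=pd.Index(['Joe Smith', 'Jane Doe', 'Rob Dole', 'Sue Pam', 'Jack Li'],
                   name='Name'),
)

# Hypothetical list; in practice it would hold the 20-25 names to keep.
males_to_keep = ['Joe Smith', 'Rob Dole']

# '@males_to_keep' refers to the local Python variable; 'Name' refers to
# the named index of df1.
result = df1.query("Gender == 'Female' or Name in @males_to_keep")
print(result)
```

This relies on the index being named Name, as it is in the question's DataFrame; with an unnamed index the query would need to refer to it differently.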