Question

This looks ugly:

df_cut = df_new[
             (
             (df_new['l_ext']==31) |
             (df_new['l_ext']==22) |
             (df_new['l_ext']==30) |
             (df_new['l_ext']==25) |
             (df_new['l_ext']==64)
             )
            ]

This does not work:

df_cut = df_new[(df_new['l_ext'] in [31, 22, 30, 25, 64])]

Is there an elegant, working solution to the above "problem"?

Answer 1

Use isin:

df_new[df_new['l_ext'].isin([31, 22, 30, 25, 64])]
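For anyone who wants to try this out, here is a minimal sketch. The sample data and the names wanted, mask and df_rest are made up for illustration; only df_new and l_ext come from the question. As far as I can tell, the `in` attempt in the question fails because Python's `in` compares the whole Series against each list element and then asks for a single truth value, which pandas refuses to give (it raises a ValueError). isin instead returns one boolean per row:

```python
import pandas as pd

# Hypothetical sample data; only the column name l_ext comes from the question.
df_new = pd.DataFrame({"l_ext": [31, 22, 7, 64, 99], "name": list("abcde")})
wanted = [31, 22, 30, 25, 64]

mask = df_new["l_ext"].isin(wanted)  # boolean Series, one entry per row
df_cut = df_new[mask]                # rows whose l_ext is in the list
df_rest = df_new[~mask]              # rows whose l_ext is NOT in the list

print(df_cut)
print(df_rest)
```

Negating the mask with ~ gives the "not in" case without any extra code.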

Answer 2

You can use pd.DataFrame.query:

select_values = [31, 22, 30, 25, 64]
df_cut = df_new.query('l_ext in @select_values')

Under the hood, this uses the top-level pd.eval function.
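A small self-contained sketch of the query approach, checked against the isin approach. The sample data and the names by_query and by_isin are assumptions for illustration; the @ prefix is how query refers to a local Python variable:

```python
import pandas as pd

# Hypothetical sample data; only the column name l_ext comes from the question.
df_new = pd.DataFrame({"l_ext": [31, 22, 7, 64, 99]})
select_values = [31, 22, 30, 25, 64]

# @select_values refers to the local variable defined above
by_query = df_new.query("l_ext in @select_values")

# The same selection written with a boolean mask, for comparison
by_isin = df_new[df_new["l_ext"].isin(select_values)]

print(by_query)
print(by_query.equals(by_isin))
```

Which one to pick is mostly a matter of taste: query reads closer to SQL, while isin keeps everything in plain Python expressions.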

Answer 3

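Overview: how do you use groupby to obtain a list and then use isin to treat that list like a subquery? Suppose you have built a list with a DataFrame groupby and want to see the descriptions that match the IUCR codes. First create a groupby result for the top 40 IUCR codes, then pass the index of that result to isin to select the rows with those IUCR codes. Apply the resulting True/False mask to the DataFrame and take the unique list of descriptions.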

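import matplotlib.pyplot as plt
import numpy as np

# df_sas is assumed to be an existing DataFrame with IUCR, Arrest and Description columns.
# Total arrests per IUCR code, keeping the 40 codes with the most arrests.
plt.figure(figsize=(22,6))
iucr=df_sas.groupby(['IUCR'])['Arrest'].sum().nlargest(40).sort_values(ascending=False)
iucr.plot.bar()
plt.show()

# Use the groupby index (the IUCR codes) as the list passed to isin.
arrest_descriptions=df_sas[df_sas['IUCR'].isin(iucr.index)]['Description'].unique()
arrest_descriptions=np.sort(arrest_descriptions)
for item in arrest_descriptions:
    print(item)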
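To see the same idea without the plotting and without the real dataset, here is a toy sketch. The rows below are invented, and the names top and descriptions are made up for this example; only the column names IUCR, Arrest and Description are taken from the code above. It keeps the top 2 codes instead of the top 40:

```python
import pandas as pd

# Hypothetical toy data reusing the column names from the answer.
df_sas = pd.DataFrame({
    "IUCR": ["0110", "0110", "0486", "0820", "0820", "1310"],
    "Arrest": [True, True, False, True, False, False],
    "Description": ["MURDER", "MURDER", "BATTERY", "THEFT", "THEFT", "DAMAGE"],
})

# Arrests per IUCR code, keeping the 2 codes with the most arrests
top = df_sas.groupby("IUCR")["Arrest"].sum().nlargest(2)

# The groupby index holds the IUCR codes, so it can be passed straight to isin
in_top = df_sas["IUCR"].isin(top.index)
descriptions = sorted(df_sas.loc[in_top, "Description"].unique())

for item in descriptions:
    print(item)
```

The point is that isin accepts any list-like, including an Index, so the result of one aggregation can filter the original frame directly.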

How to select rows from a pandas DataFrame where a value is in a list?
