Question

I am trying to create a new df as a subset of an existing df using the following:

filtered_df = df[((df.col == "Developing") | (df.col == "Ineffective") & (df.col_16 == "Developing") | (df.col_16== "Ineffective"))]

But this just returns the existing df without applying any filtering.

I have also tried:

filtered_df = df[((df.col.astype(str) == "Developing") | (df.col.astype(str) == "Ineffective") & (df.col_16.astype(str) == "Developing") | (df.col_16.astype(str) == "Ineffective"))]

That returns the same result.

I also tried swapping | and & for or and and respectively, but that raises an error basically telling me to use & or |.

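My data generally looks like this:

The desired output is a filtered version of df containing only the rows that meet my conditions (col and col_16 are both either "Developing" or "Ineffective"). With the sample data, only the second row should be returned.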

Answer 1

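It looks like you are missing a set of parentheses to group the conditions. Since & binds more tightly than |, your expression is evaluated as A | (B & C) | D rather than (A | B) & (C | D).

Try this: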

filtered_df = df[(((df['col'] == "Developing") | (df['col'] == "Ineffective")) & ((df['col_16'] == "Developing") | (df['col_16'] == "Ineffective")))]
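To see the difference the grouping makes, here is a minimal sketch. The sample data is made up for illustration (the question's actual table isn't shown), and the names a, b, c, d are just shorthand for the four comparisons. It also shows why swapping in the and keyword fails: it asks for the truth value of a whole Series.

```python
import pandas as pd

# Hypothetical sample data; the question's real table is not available.
df = pd.DataFrame({
    "col": ["Developing", "Ineffective", "Effective", "Effective"],
    "col_16": ["Effective", "Developing", "Ineffective", "Effective"],
})

a = df["col"] == "Developing"
b = df["col"] == "Ineffective"
c = df["col_16"] == "Developing"
d = df["col_16"] == "Ineffective"

# Without grouping, & is applied before |, so this means a | (b & c) | d.
ungrouped = df[a | b & c | d]

# Grouping each pair of alternatives gives the intended (a | b) & (c | d).
grouped = df[(a | b) & (c | d)]

# The and/or keywords need a single True/False value, which a Series
# cannot provide, so pandas raises a ValueError.
and_error = None
try:
    a and c
except ValueError as exc:
    and_error = str(exc)

print(ungrouped)
print(grouped)
print(and_error)
```

With this made-up data, the ungrouped mask keeps every row where either column matches on its own, while the grouped mask keeps only the row where both columns match.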

Answer 2

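You can use loc to slice the data. Assuming your original dataset looks like the one listed and is stored as df, first create a list containing the words you want to filter by.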

content_to_filter_by = ['Developing','Ineffective']

new_df = df.loc[(df['col'].isin(content_to_filter_by))&(df['col_16'].isin(content_to_filter_by)),:].copy()

See the pandas documentation for loc and the other DataFrame slicers.
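As a quick check of the isin approach, here is a small self-contained sketch. The sample rows are invented for illustration, and the names wanted, mask and explicit are my own. It compares the isin mask against the fully parenthesized comparisons to show that they select the same rows.

```python
import pandas as pd

# Hypothetical sample data; the question's real table is not available.
df = pd.DataFrame({
    "col": ["Developing", "Ineffective", "Effective", "Effective"],
    "col_16": ["Effective", "Developing", "Ineffective", "Effective"],
})

wanted = ["Developing", "Ineffective"]

# One isin call per column replaces each pair of == comparisons joined by |.
mask = df["col"].isin(wanted) & df["col_16"].isin(wanted)

# .copy() makes new_df independent of df, so later edits don't touch df.
new_df = df.loc[mask, :].copy()

# The same filter written out with explicit, grouped comparisons.
explicit = (
    ((df["col"] == "Developing") | (df["col"] == "Ineffective"))
    & ((df["col_16"] == "Developing") | (df["col_16"] == "Ineffective"))
)

print(new_df)
print((mask == explicit).all())
```

A nice side effect of isin is that each column contributes a single condition, so there is no | inside the expression to get the precedence wrong.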

Pandas filtering not working
