Question

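I have a list of values that I want to use to select rows in a DataFrame. The catch is that I want to select any row in which any of the list values appears as a substring. Example: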

index    color    shape
 1       blue     star
 2       red      square
 3       yellow   circle

My list would be

list_vals = ['sq', 'blu']

and I want to select the rows

index    color   shape
1        blue    star
2        red     square

Answer 1

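Use DataFrame.stack to convert the DataFrame to a Series, then use Series.str.contains to find the strings you are interested in. We use '|'.join to build a regex "OR" pattern from all the items in list_vals.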

For reference, the regex pattern in this case is 'sq|blu'.
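Because str.contains treats the joined string as a regular expression by default, search terms containing characters such as '.' or '+' would be read as regex syntax. A minimal sketch of building the pattern defensively, assuming the terms are meant to match literally (the use of re.escape is an addition, not part of the original answer):

```python
import re

list_vals = ['sq', 'blu']

# Escape each term so regex metacharacters match literally,
# then join the terms with '|' to form an alternation.
pattern = '|'.join(re.escape(v) for v in list_vals)

print(pattern)
```

For plain alphabetic terms like these, the escaped pattern is identical to the unescaped one.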

Next, Series.unstack restores the original shape, and DataFrame.any along axis 1 produces a boolean index, which we use to return the desired rows.

df[df.stack().str.contains('|'.join(list_vals)).unstack().any(axis=1)]

[out]

   index color   shape
0     1  blue    star
1     2   red  square
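The steps above can be put together as a self-contained sketch. It assumes the example data from the question is held in a DataFrame with an integer "index" column, and it restricts the search to the two text columns, which is a choice made here rather than something the answer states:

```python
import pandas as pd

df = pd.DataFrame({
    'index': [1, 2, 3],
    'color': ['blue', 'red', 'yellow'],
    'shape': ['star', 'square', 'circle'],
})
list_vals = ['sq', 'blu']
pattern = '|'.join(list_vals)

# Assumption: only the text columns are searched.
text = df[['color', 'shape']]

# stack -> one long Series of cells; contains -> boolean per cell;
# unstack -> back to the row/column layout; any -> one boolean per row.
mask = text.stack().str.contains(pattern).unstack().any(axis=1)

result = df[mask]
print(result)
```

The mask keeps the original row index, so it can be used directly to filter df.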

Answer 2

Here is one way:

df_filtered = df[
    df['color'].str.contains(list_vals[1])    # 'blu' matches 'blue'
    | df['shape'].str.contains(list_vals[0])  # 'sq' matches 'square'
]

print(df_filtered)
   index color   shape
0      1  blue    star
1      2   red  square
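Indexing into list_vals by position ties the result to the order of the list. One possible variation, sketched here with a hypothetical dict named terms that maps each column to the substring to look for, makes that pairing explicit:

```python
import pandas as pd

df = pd.DataFrame({
    'index': [1, 2, 3],
    'color': ['blue', 'red', 'yellow'],
    'shape': ['star', 'square', 'circle'],
})

# Hypothetical mapping (not in the original answer): column -> substring.
terms = {'color': 'blu', 'shape': 'sq'}

# Start with an all-False mask and OR in the matches for each column.
mask = pd.Series(False, index=df.index)
for col, term in terms.items():
    mask |= df[col].str.contains(term, regex=False)

df_filtered = df[mask]
print(df_filtered)
```

Passing regex=False treats each term as a literal substring.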

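Edit

Another approach is based on another answer (which contains a full explanation of this method).

The only changes I made were (1) joining the search list into a single search string and (2) returning the DataFrame (row) index of the filtered search results, which is then used to slice the original DataFrame.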

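def find_subtext(df, txt):
    # Boolean frame: True where a cell contains a match for txt
    contains = df.stack().str.contains(txt).unstack()
    # Index labels of the rows with at least one matching cell
    return contains[contains.any(axis=1)].index

df_filtered = find_subtext(df, '|'.join(list_vals))

print(df.loc[df_filtered, :])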
   index color   shape
0      1  blue    star
1      2   red  square
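Since find_subtext returns index labels rather than positions, label-based selection is the safer way to slice. A small sketch, assuming the "index" column from the question is used as the DataFrame's actual index, shows where the two differ:

```python
import pandas as pd

def find_subtext(df, txt):
    # Boolean frame: True where a cell contains a match for txt
    contains = df.stack().str.contains(txt).unstack()
    # Index labels of the rows with at least one matching cell
    return contains[contains.any(axis=1)].index

# Assumption: labels 1, 2, 3 instead of the default 0, 1, 2.
df = pd.DataFrame(
    {'color': ['blue', 'red', 'yellow'],
     'shape': ['star', 'square', 'circle']},
    index=pd.Index([1, 2, 3], name='index'),
)
list_vals = ['sq', 'blu']

labels = find_subtext(df, '|'.join(list_vals))

# .loc selects by label; .iloc would treat 1 and 2 as positions
# and return the 'red' and 'yellow' rows instead.
filtered = df.loc[labels, :]
print(filtered)
```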

Answer 3

This checks only the shape column, testing whether any value in list_vals appears at the start of the string. With the example data it selects only the square row, because no shape starts with 'blu':

df[df['shape'].apply(lambda x: any(s in x[:len(s)] for s in list_vals))]

Answer 4

Or join the list with a pipe and check each column of df with str.contains():

df[df.apply(lambda x: x.str.contains('|'.join(list_vals))).any(axis=1)]

       color   shape
index              
1      blue    star
2       red  square
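The column-wise version relies on every column holding strings. A hedged sketch of the same idea, assuming some cells may be missing (the None value and the text_cols list are illustrative additions), uses na=False so missing cells count as non-matches:

```python
import pandas as pd

df = pd.DataFrame(
    {'color': ['blue', 'red', None],
     'shape': ['star', 'square', 'circle']},
    index=pd.Index([1, 2, 3], name='index'),
)
list_vals = ['sq', 'blu']
pattern = '|'.join(list_vals)

# Assumption: the text columns are named explicitly.
text_cols = ['color', 'shape']

# Test each column for the pattern; na=False makes missing cells False.
mask = df[text_cols].apply(
    lambda col: col.str.contains(pattern, na=False)
).any(axis=1)

result = df[mask]
print(result)
```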

Select rows from a DataFrame using a list of values
