Question

I have a DataFrame in pandas (Python), and I select columns based on the following condition:

spike_cols = [col for col in nodes.columns if 'Num' in col]
print(spike_cols)

But I want to check for several substrings at once and extract every column whose name matches any of them:

spike_cols = [col for col in nodes.columns if ('Num'|'Lice') in col]
print(spike_cols)

But I get an error:

TypeError: unsupported operand type(s) for |: 'str' and 'str'

Answer 1

You can use DataFrame.filter with its regex parameter:

import pandas as pd

# Create example dataframe
df = pd.DataFrame({'HelloNum': [1, 2],
                   'World': [3, 4],
                   'This': [5, 6],
                   'ExampleLice': [7, 8]})

print(df)

   HelloNum  World  This  ExampleLice
0         1      3     5            7
1         2      4     6            8

Apply DataFrame.filter:

print(df.filter(regex='Num|Lice'))
   HelloNum  ExampleLice
0         1            7
1         2            8

Get the column names as a list:

df.filter(regex='Num|Lice').columns.tolist()

['HelloNum', 'ExampleLice']
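
If the substrings live in a list rather than a hand-written pattern, one way to build the regex is to join them with '|'. The sketch below assumes a hypothetical list named substrings and escapes each entry with re.escape, so that characters such as '.' or '(' would be matched literally; it reuses the example dataframe from above.

```python
import re
import pandas as pd

df = pd.DataFrame({'HelloNum': [1, 2],
                   'World': [3, 4],
                   'This': [5, 6],
                   'ExampleLice': [7, 8]})

# Hypothetical list of substrings to look for (not from the original question)
substrings = ['Num', 'Lice']

# Escape each substring so regex metacharacters are matched literally,
# then join with '|' (regex alternation) to get 'Num|Lice'
pattern = '|'.join(re.escape(s) for s in substrings)

# filter(regex=...) keeps the columns whose name matches the pattern anywhere
matched_cols = df.filter(regex=pattern).columns.tolist()
print(matched_cols)
```

For plain alphanumeric substrings like these, the escaping changes nothing; it only matters once a substring contains regex metacharacters.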

Answer 2

You can use Series.str.contains, which is also available on the column Index through df.columns.str:

df[df.columns[df.columns.str.contains(r'Num|Lice')]]

If you only need the column names themselves:

df.columns[df.columns.str.contains(r'Num|Lice')].tolist()
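
str.contains treats its pattern as a regular expression by default and also accepts a case parameter. As a small sketch, assuming the same example dataframe as in Answer 1, a case-insensitive match could look like this:

```python
import pandas as pd

df = pd.DataFrame({'HelloNum': [1, 2],
                   'World': [3, 4],
                   'This': [5, 6],
                   'ExampleLice': [7, 8]})

# Boolean mask with one entry per column; case=False ignores upper/lower case
mask = df.columns.str.contains('num|lice', case=False)

# Index the column Index with the mask to keep only the matching names
matched_cols = df.columns[mask].tolist()
print(matched_cols)
```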

Answer 3

Try this:

spike_cols = [col for col in nodes.columns if ('Num' in col or 'Lice' in col)]
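
The original attempt fails because Python evaluates 'Num'|'Lice' first, and | is not defined between two strings. For more than a couple of substrings, the chained or can be generalized with any(). This is a minimal sketch: the plain list columns is a stand-in for nodes.columns, and substrings is a hypothetical name.

```python
# Stand-in for nodes.columns (assumed column names, for illustration only)
columns = ['HelloNum', 'World', 'This', 'ExampleLice']

# Hypothetical list of substrings to look for
substrings = ['Num', 'Lice']

# Keep a column if at least one substring occurs in its name
spike_cols = [col for col in columns if any(s in col for s in substrings)]
print(spike_cols)
```

With a real dataframe, iterating over nodes.columns in place of the list should behave the same way.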

How do I check for multiple substrings to get column names in Python?
