Question

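Suppose I have a pandas DataFrame df and a list l, both shown below. I want to select every column of df whose name starts with any of the strings in l. In this case, I want to get df[['word', 'hello1', 'hello2', 'hello3']]. Is there a fast way to do this? I could loop over each element of the list, but that might take a long time for a larger DataFrame.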

import pandas as pd
df = pd.DataFrame({
    'word': [13,4],
    'another': [1,4],
    'champ': [1,5],
    'hello1': [1,2],
    'hello2': [4,5],
    'hello3': [7,8]
})

l = ['word', 'hello']

# what I want to get:
   word  hello1  hello2  hello3
0    13       1       4       7
1     4       2       5       8

Answer 1

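Since you want the columns whose names start with the given words, you can build a regular expression from l and match it against the column names. Note that the strings in l are treated as regex patterns here: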

df.loc[:, df.columns.str.match(f'^({"|".join(l)})')]

Output:

   word  hello1  hello2  hello3
0    13       1       4       7
1     4       2       5       8
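
If the prefixes might contain regex metacharacters such as '.' or '(', the pattern above could match more (or less) than intended. The following is a minimal sketch of one way to guard against that, assuming the goal is purely literal prefix matching; the names pattern, mask and result are introduced here for illustration only.

```python
import re

import pandas as pd

df = pd.DataFrame({
    'word': [13, 4],
    'another': [1, 4],
    'champ': [1, 5],
    'hello1': [1, 2],
    'hello2': [4, 5],
    'hello3': [7, 8],
})
l = ['word', 'hello']

# Escape each prefix so that characters like '.' or '(' are matched literally.
pattern = '|'.join(re.escape(prefix) for prefix in l)

# str.match only matches at the start of each column name,
# so an explicit '^' anchor is not required.
mask = df.columns.str.match(pattern)

result = df.loc[:, mask]
print(result.columns.tolist())
```

For the example data this selects the same four columns as the one-liner above; the escaping only matters once the prefixes contain special characters.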

Answer 2

Try this:

df.loc[:, df.columns.str.startswith(tuple(l))]
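
Whether the pandas str.startswith accessor accepts a tuple may depend on the installed pandas version. As a hedged alternative, the sketch below uses Python's built-in str.startswith, which does accept a tuple of prefixes, assuming every column name is a string. The names prefixes, selected and result are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'word': [13, 4],
    'another': [1, 4],
    'champ': [1, 5],
    'hello1': [1, 2],
    'hello2': [4, 5],
    'hello3': [7, 8],
})
l = ['word', 'hello']

# The built-in str.startswith takes a tuple and returns True
# if the string starts with any of its elements.
prefixes = tuple(l)

# Keep the column names that begin with one of the prefixes,
# preserving their original order in df.
selected = [col for col in df.columns if col.startswith(prefixes)]

result = df[selected]
print(result)
```

This iterates over the column names once rather than over the rows, so the cost scales with the number of columns, not with the size of the data.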

Select pandas DataFrame columns whose names are similar to strings in a list
