Pandas DataFrame: search a column for all strings in a list

Time: 2014-03-13 06:10:29

Tags: python pandas dataframe

In the following DataFrame, I need to search the 'path' column for all of the strings in the list 'a'.

df = pd.DataFrame({'id' : [1,2,3,4],
                'path'  : ["p1,p2,p3,p4","p1,p2,p1","p1,p5,p5,p7","p1,p2,p3,p3"]})

I need to check whether both 'p1' and 'p2' are present in each row.

a = ['p1','p2']

Something like the following:

if all(x in df.path for x in a):
    print(df)

1 Answer:

Answer 0: (score: 1)

How about this?

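import re

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'path': ["p1,p2,p3,p4", "p1,p2,p1", "p1,p5,p5,p7", "p1,p2,p3,p3"]})

a = ['p1', 'p2']

# see: http://stackoverflow.com/a/470602/1407427
# Build one positive lookahead per search string, so the pattern matches only
# when every string occurs somewhere in the value. re.escape protects against
# regex metacharacters in the search strings.
reg_exp = ''.join('(?=.*%s)' % re.escape(i) for i in a)

# alternatively: print(df.path.str.match(reg_exp))
print(df.path.str.contains(reg_exp))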

Result:

0     True
1     True
2    False
3     True
Name: path, dtype: bool
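
Note that the regular expression does substring matching, so 'p1' would also match a value such as 'p10'. If whole entries should be compared instead, one possible alternative is to split each path and test set membership. This is only a sketch, and it assumes the paths are always comma-separated with no surrounding whitespace; the names `required`, `mask`, and `matching_rows` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4],
                   'path': ["p1,p2,p3,p4", "p1,p2,p1", "p1,p5,p5,p7", "p1,p2,p3,p3"]})
a = ['p1', 'p2']

# Assumption: entries are separated by plain commas, no extra spaces.
required = set(a)

# Split each path on commas and check that every required entry is present.
# Whole entries are compared, so 'p1' does not match 'p10'.
mask = df['path'].apply(lambda s: required.issubset(s.split(',')))

# Boolean indexing keeps only the rows where the mask is True.
matching_rows = df[mask]
print(matching_rows)
```

The same boolean indexing should also work with the mask returned by `df.path.str.contains(reg_exp)` above, which gets closer to what the question asks for: printing only the rows that contain every string in `a`.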