How do I filter a list based on another list that contains wildcards?

Asked: 2014-04-29 15:06:05

Tags: python list glob

How can I filter a list based on another list that contains partial values and wildcards? The following example is what I have so far:

l1 = ['test1', 'test2', 'test3', 'test4', 'test5']
l2 = set(['*t1*', '*t4*'])

filtered = [x for x in l1 if x not in l2]
print(filtered)

This example results in:

['test1', 'test2', 'test3', 'test4', 'test5']

However, I would like the result to be restricted by l2 to the following:

['test2', 'test3', 'test5']

3 answers:

Answer 0 (score: 10)

Use the fnmatch module and a list comprehension with any():

>>> from fnmatch import fnmatch
>>> l1 = ['test1', 'test2', 'test3', 'test4', 'test5']
>>> l2 = set(['*t1*', '*t4*'])
>>> [x for x in l1 if not any(fnmatch(x, p) for p in l2)]
['test2', 'test3', 'test5']
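
The one-liner above can be wrapped in a small reusable function. The following is only a minimal sketch: the name exclude_matching is hypothetical (it does not appear in the answer), and it swaps fnmatch() for fnmatchcase(), assuming you want case-sensitive matching on every platform (fnmatch() normalizes case according to the operating system's rules).

```python
from fnmatch import fnmatchcase

def exclude_matching(items, patterns):
    """Return the items that match none of the glob patterns.

    Hypothetical helper: fnmatchcase() is used instead of fnmatch()
    so that matching is case-sensitive regardless of the platform.
    """
    return [item for item in items
            if not any(fnmatchcase(item, pattern) for pattern in patterns)]

l1 = ['test1', 'test2', 'test3', 'test4', 'test5']
l2 = {'*t1*', '*t4*'}

result = exclude_matching(l1, l2)
print(result)
```

If platform-dependent case handling is what you want, fnmatch() can be dropped back in without any other change.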

Answer 1 (score: 1)

You can also use filter() instead of a list comprehension, which makes it easy to swap out the filter function for more flexibility:

>>> from fnmatch import fnmatch
>>> l1 = ['test1', 'test2', 'test3', 'test4', 'test5']
>>> l2 = set(['*t1*', '*t4*'])
>>> filterfunc = lambda item: not any(fnmatch(item, pattern) for pattern in l2)
>>> list(filter(filterfunc, l1))  # list() is needed on Python 3, where filter() returns an iterator
['test2', 'test3', 'test5']
>>> # Suppose we no longer like this filter function and decide that l2 should match on any substring, so the asterisks can go:
>>> l2 = set(['t1', 't4'])
>>> filterfunc = lambda item: not any(pattern in item for pattern in l2)
>>> list(filter(filterfunc, l1))
['test2', 'test3', 'test5']

This way, you can even generalize filterfunc to work with multiple pattern sets:

>>> from functools import partial
>>> def filterfunc(item, patterns):
...     return not any(pattern in item for pattern in patterns)
...
>>> list(filter(partial(filterfunc, patterns=l2), l1))
['test2', 'test3', 'test5']
>>> list(filter(partial(filterfunc, patterns={'t1', 'test5'}), l1))
['test2', 'test3', 'test4']

Of course, you could easily extend filterfunc to accept, for example, regular expressions in the pattern set.
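
The answer does not show what a regular-expression version would look like. One possible sketch follows; the function name exclude_regex and the pattern r't[14]$' are illustrative assumptions, not part of the original answer.

```python
import re

def exclude_regex(items, patterns):
    """Return the items that none of the regular expressions match.

    Hypothetical helper: each pattern is compiled once, and
    re.search() is used so a pattern may match anywhere in the item.
    """
    compiled = [re.compile(pattern) for pattern in patterns]
    return [item for item in items
            if not any(rx.search(item) for rx in compiled)]

l1 = ['test1', 'test2', 'test3', 'test4', 'test5']

# r't[14]$' matches items ending in 't1' or 't4'.
result = exclude_regex(l1, [r't[14]$'])
print(result)
```

Using search() keeps the behavior close to the substring test above; switch to fullmatch() if a pattern should have to cover the whole item.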

Answer 2 (score: 1)

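I think the simplest approach for your use case is to use Python's in operator to test for substrings (although this means removing your asterisks):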

def remove_if_not_substring(l1, l2):
    return [i for i in l1 if not any(j in i for j in l2)]

So here is our data:

l1 = ['test1', 'test2', 'test3', 'test4', 'test5']
l2 = set(['t1', 't4'])

Calling our function with it:

remove_if_not_substring(l1, l2)

returns:

['test2', 'test3', 'test5']
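
If the patterns still carry the asterisks from the question, they could be stripped before the substring test. This is only a sketch: strip_stars is a hypothetical helper, and it assumes every pattern has the simple form '*text*', with no other wildcards such as ? or [...] and no asterisk in the middle.

```python
def strip_stars(patterns):
    """Drop leading and trailing asterisks from each pattern.

    Hypothetical helper; only valid for patterns of the form '*text*'.
    Patterns that are nothing but asterisks are skipped, because an
    empty substring would match every item.
    """
    stripped = (pattern.strip('*') for pattern in patterns)
    return {text for text in stripped if text}

l1 = ['test1', 'test2', 'test3', 'test4', 'test5']
l2 = {'*t1*', '*t4*'}

needles = strip_stars(l2)
filtered = [item for item in l1
            if not any(needle in item for needle in needles)]
print(filtered)
```

For anything beyond plain '*text*' patterns, the fnmatch-based approach is the safer choice.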