Question

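I have a list of strings, and I want to find every line that contains 'http://' but does not contain 'lulz', 'lmfao', '.png', or any other item from a list of strings. How would I go about this?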

My instincts tell me to use regular expressions, but I have a moral objection to witchcraft.

Answer 1

If the list of strings to exclude is large, this is a fairly extensible option:

exclude = ['lulz', 'lmfao', '.png']
filter_func = lambda s: 'http://' in s and not any(x in s for x in exclude)

matching_lines = filter(filter_func, string_list)

List comprehension alternative:

matching_lines = [line for line in string_list if filter_func(line)]
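
For a quick check, the two variants above can be put side by side in a self-contained sketch. The sample `string_list` and the helper name `is_match` are made up for illustration and are not part of the original answer. Note that in Python 3 `filter` returns a lazy iterator, so it is wrapped in `list` here.

```python
# Hypothetical sample data, not taken from the question.
string_list = [
    'see http://example.com/page',
    'http://example.com/lulz',
    'no link here',
    'http://example.com/cat.png',
]
exclude = ['lulz', 'lmfao', '.png']

def is_match(s):
    """True if s contains 'http://' and none of the excluded substrings."""
    return 'http://' in s and not any(x in s for x in exclude)

# filter() is lazy in Python 3, so materialize it before printing.
filtered = list(filter(is_match, string_list))
comprehension = [line for line in string_list if is_match(line)]

print(filtered)
print(filtered == comprehension)
```

Both approaches select the same lines; the list comprehension simply builds the list eagerly.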

Answer 2

This is almost equivalent to F.J's solution, but uses generator expressions instead of a lambda expression and the filter function:

haystack = ['http://blah', 'http://lulz', 'blah blah', 'http://lmfao']
exclude = ['lulz', 'lmfao', '.png']

http_strings = (s for s in haystack if s.startswith('http://'))
result_strings = (s for s in http_strings if not any(e in s for e in exclude))

print(list(result_strings))

When I run it, it prints:

['http://blah']
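
One difference worth keeping in mind: `startswith` only keeps lines that begin with 'http://', while the question asks for lines that contain it. The following sketch compares the two tests; the sample lines and the helper name `keep` are assumptions made for illustration.

```python
# Hypothetical sample lines; the second one has the URL mid-string.
haystack = ['http://blah', 'visit http://blah', 'http://lulz']
exclude = ['lulz', 'lmfao', '.png']

def keep(lines, test):
    """Keep lines that pass `test` and contain none of the excluded substrings."""
    return [s for s in lines if test(s) and not any(e in s for e in exclude)]

starts = keep(haystack, lambda s: s.startswith('http://'))
contains = keep(haystack, lambda s: 'http://' in s)

print(starts)    # ['http://blah']
print(contains)  # ['http://blah', 'visit http://blah']
```

Which test is right depends on whether the URL is always expected at the start of the line.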

Answer 3

Try this:

for s in strings:
    if 'http://' in s and 'lulz' not in s and 'lmfao' not in s and '.png' not in s:
        # found it
        pass

Another option, if you need something more flexible:

words = ('lmfao', '.png', 'lulz')
for s in strings:
    # keep s only if none of the excluded words occurs in it
    if 'http://' in s and all(word not in s for word in words):
        # found it
        pass
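
If the loop needs to be reused with different required and forbidden substrings, it could be wrapped in a small helper. This is only a sketch: the function name `find_matches`, its parameters, and the sample data are hypothetical and not part of the original answer.

```python
def find_matches(strings, required, forbidden):
    """Return the strings that contain `required` and none of `forbidden`."""
    return [s for s in strings
            if required in s and all(word not in s for word in forbidden)]

# Hypothetical sample data.
strings = ['http://blah', 'http://x/lmfao', 'ftp://blah', 'http://a/b.png']
matches = find_matches(strings, 'http://', ('lmfao', '.png', 'lulz'))
print(matches)  # ['http://blah']
```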
