Question

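First of all, sorry if the title is unclear; I had a hard time phrasing it properly. That is also why I have not been able to find out whether this question has already been asked.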

So, I have a list of strings, and I want to perform a "procedural" search in which every * in my target substring can stand for any possible substring. Here is an example:

strList = ['obj_1_mesh',
           'obj_2_mesh',
           'obj_TMP',
           'mesh_1_TMP',
           'mesh_2_TMP',
           'meshTMP']

searchFor('mesh_*')
# should return: ['mesh_1_TMP', 'mesh_2_TMP']

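In this case, where there is only one *, I just split the string on * and use startswith() and/or endswith(), which works fine. But I don't know how to do the same thing when the search string contains multiple *.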

So my question is: how do I search a list of strings for any number of unknown substrings in place of *?
For example:

strList = ['obj_1_mesh',
           'obj_2_mesh',
           'obj_TMP',
           'mesh_1_TMP',
           'mesh_2_TMP',
           'meshTMP']

searchFor('*_1_*')
# should return: ['obj_1_mesh', 'mesh_1_TMP']

I hope everything is clear. Thanks.

Answer 1

Consider using 'fnmatch', which provides Unix-like file pattern matching. More info: http://docs.python.org/2/library/fnmatch.html

from fnmatch import fnmatch
strList = ['obj_1_mesh',
           'obj_2_mesh',
           'obj_TMP',
           'mesh_1_TMP',
           'mesh_2_TMP',
           'meshTMP']

searchFor = '*_1_*'

# Keep every string that matches the wildcard pattern
resultSubList = [s for s in strList if fnmatch(s, searchFor)]

This should do the trick.
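As a small variation on the above, the same module also offers fnmatch.filter, which filters a whole list in one call, and fnmatch.fnmatchcase, which compares case-sensitively on every platform (plain fnmatch follows the operating system's case rules). The sketch below assumes Python 3; the variable names are only illustrative.

```python
import fnmatch

names = ['obj_1_mesh', 'obj_2_mesh', 'obj_TMP',
         'mesh_1_TMP', 'mesh_2_TMP', 'meshTMP']

# filter() applies the pattern to the whole list at once
matches = fnmatch.filter(names, '*_1_*')

# fnmatchcase() never normalizes case, regardless of the OS
strict = [s for s in names if fnmatch.fnmatchcase(s, 'mesh_*')]

print(matches)
print(strict)
```

Keep in mind that fnmatch patterns also treat ? and [seq] as special, not just *.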

Answer 2

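If I were you, I would use the regular expression package. You will need to learn a little regex to write correct search queries, but it's not too bad. In this context, '.+' is very similar to '*'.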

import re

def search_strings(str_list, search_query):
    regex = re.compile(search_query)
    result = []
    for string in str_list:
        match = regex.match(string)
        if match is not None:
            result.append(match.group())
    return result

strList= ['obj_1_mesh',
          'obj_2_mesh',
          'obj_TMP',
          'mesh_1_TMP',
          'mesh_2_TMP',
          'meshTMP']

print(search_strings(strList, '.+_1_.+'))

This should return ['obj_1_mesh', 'mesh_1_TMP']. I tried to replicate the '*_1_*' case. For 'mesh_*', you can make the search_query 'mesh_.+'. Here is a link to the Python regex API: https://docs.python.org/2/library/re.html
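One detail worth knowing with this approach: match() only anchors the pattern at the start of the string, so a query that ends in literal text can still accept strings that continue past it. A minimal sketch of the difference, assuming Python 3.4+ (where fullmatch() is available) and two made-up names:

```python
import re

names = ['obj_1_mesh', 'obj_1_mesh_old']  # hypothetical sample data
pat = re.compile('.+_mesh')

# match() accepts any string that *starts* with a match
prefix_hits = [s for s in names if pat.match(s)]

# fullmatch() requires the whole string to match
full_hits = [s for s in names if pat.fullmatch(s)]

print(prefix_hits)
print(full_hits)
```

If the pattern is meant to describe the entire string, fullmatch() (or a trailing $ in the pattern) is probably the safer choice.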

Answer 3

The simplest way is to use fnmatch, as shown in ma3oun's answer. But here is a way to do it using Regular Expressions, aka regex.

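First, we transform your searchFor pattern so that it uses '.+?' as the "wildcard" instead of '*'. Then we compile the result into a regex pattern object so we can use it efficiently for multiple tests.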

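For an explanation of regex syntax, please see the docs. But briefly, the dot means any character (on this line), the + means look for one or more of them, and the ? means do non-greedy matching, i.e., match the smallest string that conforms to the pattern rather than the longest (which is what greedy matching does).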
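To make the greedy versus non-greedy distinction concrete, here is a tiny illustration (the variable names are just for this sketch):

```python
import re

text = 'obj_1_mesh'

# Greedy: .+ grabs as much as possible, then backs off to the last '_'
greedy = re.match('.+_', text).group()

# Non-greedy: .+? stops at the first '_' that lets the pattern succeed
lazy = re.match('.+?_', text).group()

print(greedy)  # obj_1_
print(lazy)    # obj_
```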

import re

strList = ['obj_1_mesh',
           'obj_2_mesh',
           'obj_TMP',
           'mesh_1_TMP',
           'mesh_2_TMP',
           'meshTMP']

searchFor = '*_1_*'
pat = re.compile(searchFor.replace('*', '.+?'))

result = [s for s in strList if pat.match(s)]
print(result)

Output

['obj_1_mesh', 'mesh_1_TMP']

If we use searchFor = 'mesh_*', the result is

['mesh_1_TMP', 'mesh_2_TMP']

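Note that this solution is not robust. If searchFor contains other characters that have special meaning in a regex, they need to be escaped. In fact, rather than doing the searchFor.replace transformation, it would be cleaner to write the pattern using regex syntax in the first place.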
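One possible way to handle the escaping concern is to split the pattern on '*', pass each literal piece through re.escape, and join the pieces with the wildcard expression. The helper names wildcard_to_regex and search_for below are hypothetical, and the sketch assumes Python 3.4+ and that '*' should stand for one or more characters (as '.+' does); use '.*' instead if an empty match should be allowed.

```python
import re

def wildcard_to_regex(pattern, wildcard='.+'):
    """Compile a '*' wildcard pattern, escaping every literal piece."""
    pieces = pattern.split('*')
    return re.compile(wildcard.join(re.escape(p) for p in pieces))

def search_for(strings, pattern):
    """Return the strings that match the whole wildcard pattern."""
    pat = wildcard_to_regex(pattern)
    return [s for s in strings if pat.fullmatch(s)]

strList = ['obj_1_mesh', 'obj_2_mesh', 'obj_TMP',
           'mesh_1_TMP', 'mesh_2_TMP', 'meshTMP']

print(search_for(strList, '*_1_*'))
print(search_for(strList, 'mesh_*'))

# The dot in the pattern is matched literally, not as "any character"
print(search_for(['a.b1', 'axb1'], 'a.b*'))
```

This version uses fullmatch() rather than match(), so the pattern has to describe the entire string.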

Answer 4

If the string you are looking for always looks like string, you can use the find function; you would get something like:

for s in strList:
    if s.find(searchFor) != -1:
        do_something()

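If you are looking for multiple strings (like abc*123*test), you need to look for each one in turn: find the second string in the same string starting at the index where you found the first plus its len, and so on.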
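The sequential find idea can be sketched roughly as follows. The function name wildcard_match is hypothetical, and the sketch assumes that '*' may stand for zero or more characters and that the text before the first '*' and after the last '*' must sit at the very start and end of the string.

```python
def wildcard_match(s, pattern):
    """Match s against a pattern in which '*' is the only wildcard."""
    pieces = pattern.split('*')
    if len(pieces) == 1:              # no wildcard: plain comparison
        return s == pattern
    first, last = pieces[0], pieces[-1]
    if not s.startswith(first) or not s.endswith(last):
        return False
    pos = len(first)                  # search resumes after the prefix
    end = len(s) - len(last)          # and must stop before the suffix
    if end < pos:                     # prefix and suffix would overlap
        return False
    for piece in pieces[1:-1]:
        idx = s.find(piece, pos, end)
        if idx == -1:
            return False
        pos = idx + len(piece)        # continue after this piece
    return True

strList = ['obj_1_mesh', 'obj_2_mesh', 'obj_TMP',
           'mesh_1_TMP', 'mesh_2_TMP', 'meshTMP']

print([s for s in strList if wildcard_match(s, '*_1_*')])
print([s for s in strList if wildcard_match(s, 'mesh_*')])
```

Taking the leftmost occurrence of each middle piece is enough here, because an earlier hit always leaves at least as much room for the pieces that follow.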

Search a list of strings for any number of unknown substrings in place of *

4 answers: