Question

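I have a list of strings (about 100), and I want to find the first occurrence of any of them in another string, along with the index at which it occurs.

I keep that index, search again from it using a different word list, then go back to the first list, and so on until I reach the end of the string.

My current code (which finds the first occurrence) looks like this: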

        import sys

        def findFirstOccurence(wordList, bigString, startIndex):
            # sys.maxsize is the "not found" value (sys.maxint exists only in Python 2)
            substrIndex = sys.maxsize
            for word in wordList:
                tempIndex = bigString.find(word, startIndex)
                if tempIndex < substrIndex and tempIndex != -1:
                    substrIndex = tempIndex
            return substrIndex

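This code does the job, but it takes a lot of time (I run it several times with the same word lists, over 100 large strings of about 10K-20K words each).

I'm sure there is a better (and more Pythonic) way to do this.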

Answer 1

This seems to work well, and it also tells you which word was found (though that part can be left out):

from sys import maxsize  # sys.maxint exists only in Python 2

words = 'a big red dog car woman mountain are the ditch'.split()
sentence = 'her smooth lips reminded me of the front of a big red car lying in the ditch'

def find(word, sentence):
    # Return (index, word); misses get maxsize so they sort last in min().
    try:
        return sentence.index(word), word
    except ValueError:
        return maxsize, None

print(min(find(word, sentence) for word in words))
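
Note that the snippet above always searches from the beginning of the sentence. Here is a minimal sketch of a variant that honors the start index from the question, written for Python 3. The name `find_first` and the `(index, word)` return shape, with `(-1, None)` for no match, are my own assumptions rather than part of the original answer:

```python
def find_first(words, text, start=0):
    """Return (index, word) for the earliest match at or after start, or (-1, None)."""
    best_index, best_word = -1, None
    for word in words:
        index = text.find(word, start)
        # Keep the match only if it exists and is earlier than the best so far.
        if index != -1 and (best_index == -1 or index < best_index):
            best_index, best_word = index, word
    return best_index, best_word


words = 'a big red dog car woman mountain are the ditch'.split()
sentence = 'her smooth lips reminded me of the front of a big red car lying in the ditch'
print(find_first(words, sentence))
print(find_first(words, sentence, start=40))
```

This still calls `find` once per word, so it should cost about the same as the loop in the question; it only adds the matched word to the result.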

Answer 2

A one-liner using list comprehensions would be

return min([index for index in [bigString.find(word, startIndex) for word in wordList] if index != -1])

But I think it is more readable if you split it into two lines:

indexes = [bigString.find(word, startIndex) for word in wordList]
return min([index for index in indexes if index != -1])
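
One caveat: `min` raises `ValueError` on an empty sequence, so both versions above fail when none of the words occurs. A minimal sketch that handles this case, assuming Python 3.4 or later for the `default` argument of `min`; the function name and the choice of -1 as the "not found" value are my assumptions:

```python
def find_first_index(word_list, big_string, start_index=0):
    """Return the lowest index at which any word occurs, or -1 if none does."""
    indexes = (big_string.find(word, start_index) for word in word_list)
    # default=-1 is returned when every find() call came back with -1.
    return min((index for index in indexes if index != -1), default=-1)


print(find_first_index(['hello', 'world'], '1 2 3 world'))
print(find_first_index(['hello', 'world'], 'nothing here'))
```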

Answer 3

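import re

def findFirstOccurence(wordList, bigString, startIndex=0):
    # Escape each word so regex metacharacters are matched literally.
    pattern = re.compile('|'.join(re.escape(word) for word in wordList))
    # Passing startIndex to search() (instead of slicing) keeps the index
    # relative to the whole string.
    match = pattern.search(bigString, startIndex)
    # Return -1 when nothing matches instead of failing on None.start().
    return match.start() if match else -1

wordList = ['hello', 'world']
bigString = '1 2 3 world'

print(findFirstOccurence(wordList, bigString))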
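
Because the question runs the same word lists against about 100 large strings, it may pay to compile each pattern once and reuse it. The following is only a sketch under my own assumptions: Python 3, the hypothetical names `compile_words` and `alternate_matches`, and the reading that each search resumes at the end of the previous match. I have not benchmarked it against the loop in the question.

```python
import re


def compile_words(word_list):
    """Build one regex that matches any of the words literally."""
    return re.compile('|'.join(re.escape(word) for word in word_list))


def alternate_matches(patterns, big_string):
    """Yield (index, word), cycling through the patterns until one fails to match."""
    position, turn = 0, 0
    while position <= len(big_string):
        match = patterns[turn % len(patterns)].search(big_string, position)
        if match is None:
            return
        yield match.start(), match.group()
        # Resume after this match; step one extra character on an empty match.
        position = match.end() if match.end() > match.start() else match.end() + 1
        turn += 1


first = compile_words(['hello', 'world'])
second = compile_words(['2', '3'])
print(list(alternate_matches([first, second], '1 2 3 world 2 hello')))
```

The compiled patterns can be built once and passed to `alternate_matches` for every large string. Keep in mind that when two words match at the same position, the alternation picks the one listed first, not the longest one.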

How to find the first occurrence of any string from a list in another string in Python