Question

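I am trying to write a program that lists the positions at which a substring occurs within a parent string. For example, searching for "bc" in the parent string "abcabcabcabcabcabca" should return 1, 4, 7, 10, 13, 16.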

So far I have been using:

import string

def subStringMatchExact():
    print "This program will index the locations a given sequence"
    print "occurs within a larger sequence"
    seq = raw_input("Please input a sequence to search within: ")
    sub = raw_input("Please input a sequence to search for: ")
    n = 0
    for i in seq:
        x = string.find(seq, sub [n:])
        print x
        n = x + 1

I have also tried replacing string.find with the string.index function. Any suggestions would be greatly appreciated.

Answer 1

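Just call .find() on the input string itself. It returns the position of the match, or -1 if no match is found. It also accepts a start argument, so you can look for the next match: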

def subStringMatchExact():
    print "This program will index the locations a given sequence"
    print "occurs within a larger sequence"
    seq = raw_input("Please input a sequence to search within: ")
    sub = raw_input("Please input a sequence to search for: ")

    positions = []
    pos = -1
    while True:
        pos = seq.find(sub, pos + 1)  # start searching *beyond* the previous match
        if pos == -1:   # Not found
            break
        positions.append(pos)
    return positions
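
For reference, here is a minimal Python 3 sketch of the same loop, written as a function that takes its inputs as arguments instead of prompting for them. The name `find_all` is made up for this example, and the empty-substring guard is an added assumption rather than part of the answer above.

```python
def find_all(seq, sub):
    """Return every index at which sub occurs in seq, overlaps included."""
    if not sub:
        # Assumption for this sketch: an empty substring has no matches.
        return []
    positions = []
    pos = seq.find(sub)
    while pos != -1:
        positions.append(pos)
        # Resume one character past the previous match so that
        # overlapping occurrences are reported as well.
        pos = seq.find(sub, pos + 1)
    return positions


print(find_all("abcabcabcabcabcabca", "bc"))  # [1, 4, 7, 10, 13, 16]
print(find_all("aaaa", "aa"))                 # [0, 1, 2]
```

Because the search restarts at `pos + 1` rather than `pos + len(sub)`, overlapping matches such as those in "aaaa" are all found.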

Answer 2

I'm lazy, so I would use re.finditer:

>>> import re
>>> s = "abcabcabcabcabcabca"
>>> for m in re.finditer('bc',s):
...     print m.start()
... 
1
4
7
10
13
16
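
One caveat worth sketching: `re.finditer` treats its first argument as a regular expression and reports only non-overlapping matches. The Python 3 example below escapes the substring with `re.escape` and, as an optional variation, wraps it in a lookahead so that overlapping occurrences are found too. The helper name `regex_positions` and its `overlapping` flag are invented for illustration.

```python
import re


def regex_positions(seq, sub, overlapping=False):
    """Return the start index of each match of the literal string sub in seq."""
    # Escape sub so characters such as "." or "*" are matched literally.
    pattern = re.escape(sub)
    if overlapping:
        # A zero-width lookahead matches without consuming characters,
        # so the scan can report matches that overlap each other.
        pattern = "(?=" + pattern + ")"
    return [m.start() for m in re.finditer(pattern, seq)]


print(regex_positions("abcabcabcabcabcabca", "bc"))   # [1, 4, 7, 10, 13, 16]
print(regex_positions("aaaa", "aa"))                   # [0, 2]
print(regex_positions("aaaa", "aa", overlapping=True)) # [0, 1, 2]
```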

Answer 3

A list comprehension is a very elegant way to do it, if that matters to you:

>>> seq = "abcabcabcabcabcabca"
>>> sub = "bc"
>>> [i for i in range(len(seq)) if seq[i:].startswith(sub)]
[1, 4, 7, 10, 13, 16]

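It should also be fast, although no timings are given to support that. It walks through the string and checks, at each position, whether the remainder of the string (from that position to the end) starts with the specified substring. If it does, the position is collected; if not, it moves on to the next position.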
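
Each `seq[i:]` slice builds a new string, so a possible variation passes the start offset to `str.startswith` instead, which performs the check in place. This is a Python 3 sketch; the function name `startswith_positions` is hypothetical.

```python
def startswith_positions(seq, sub):
    """Return every index i at which seq has sub starting at position i."""
    # Last index at which sub could still fit inside seq.
    last_start = len(seq) - len(sub)
    # startswith(sub, i) compares at offset i without slicing seq.
    return [i for i in range(last_start + 1) if seq.startswith(sub, i)]


print(startswith_positions("abcabcabcabcabcabca", "bc"))  # [1, 4, 7, 10, 13, 16]
```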

Python: indexing the positions of a substring within a string
