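I'm trying to write a program that lists the positions at which a substring occurs within a parent string. For example, if we search for "bc" in the parent string "abcabcabcabcabcabca", the program should return 1, 4, 7, 10, 13, 16.
So far I've been using: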
import string

def subStringMatchExact():
    print "This program will index the locations a given sequence"
    print "occurs within a larger sequence"
    seq = raw_input("Please input a sequence to search within: ")
    sub = raw_input("Please input a sequence to search for: ")
    n = 0
    for i in seq:
        x = string.find(seq, sub [n:])
        print x
        n = x + 1
I've also tried replacing string.find with string.index. Any suggestions would be greatly appreciated.
Answer 0 (score: 3)
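Just call .find() on the input string itself. It returns the position of the match, or -1 if no match is found. It also accepts a start argument, so you can look for the next match: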
def subStringMatchExact():
    print "This program will index the locations a given sequence"
    print "occurs within a larger sequence"
    seq = raw_input("Please input a sequence to search within: ")
    sub = raw_input("Please input a sequence to search for: ")
    positions = []
    pos = -1
    while True:
        pos = seq.find(sub, pos + 1)  # start searching *beyond* the previous match
        if pos == -1:  # Not found
            break
        positions.append(pos)
    return positions
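For testing, it can help to separate the search loop from the input prompts. The following is a minimal sketch of the same .find() loop as a standalone function, written so that it also runs on Python 3. The name find_all and the handling of an empty search string are assumptions made for illustration, not part of the answer above.

```python
def find_all(seq, sub):
    """Return the start index of every occurrence of sub in seq."""
    if not sub:
        # Assumption: an empty search string is treated as having no matches.
        return []
    positions = []
    pos = seq.find(sub)
    while pos != -1:
        positions.append(pos)
        # Resume one character past the previous match, so overlapping
        # occurrences are reported as well.
        pos = seq.find(sub, pos + 1)
    return positions


print(find_all("abcabcabcabcabcabca", "bc"))  # [1, 4, 7, 10, 13, 16]
```

Because the search resumes at pos + 1 rather than pos + len(sub), a call such as find_all("aaaa", "aa") reports the overlapping positions 0, 1 and 2.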
Answer 1 (score: 3)
I'm lazy, so I would use re.finditer:
>>> import re
>>> s = "abcabcabcabcabcabca"
>>> for m in re.finditer('bc',s):
...     print m.start()
...
1
4
7
10
13
16
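Two details may matter when the search string comes from user input: characters such as "." or "*" have special meaning in a regular expression, and re.finditer only reports non-overlapping matches. The sketch below wraps the same idea in a function that escapes the search string with re.escape and, optionally, uses a zero-width lookahead to pick up overlapping matches. The function name find_all_re and its overlapping parameter are hypothetical, chosen for this example.

```python
import re


def find_all_re(seq, sub, overlapping=False):
    """Return the start positions of sub in seq using re.finditer."""
    # re.escape makes characters such as "." or "*" in sub match literally.
    pattern = re.escape(sub)
    if overlapping:
        # A zero-width lookahead consumes no characters, so the next
        # search can begin inside the previous occurrence.
        pattern = "(?=" + pattern + ")"
    return [m.start() for m in re.finditer(pattern, seq)]


print(find_all_re("abcabcabcabcabcabca", "bc"))  # [1, 4, 7, 10, 13, 16]
```

For the example string the two modes give the same result, since "bc" cannot overlap itself. They differ for inputs such as find_all_re("aaaa", "aa"), which yields [0, 2] by default and [0, 1, 2] with overlapping=True.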
Answer 2 (score: 0)
A list comprehension is a very elegant way to do it, if that matters to you:
>>> seq = "abcabcabcabcabcabca"
>>> sub = "bc"
>>> [i for i in range(len(seq)) if seq[i:].startswith(sub)]
[1, 4, 7, 10, 13, 16]
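It walks through the string and checks, at each position, whether the rest of the string (from that position to the end) starts with the given substring. If it does, the position is collected; if not, it moves on to the next position. It is unlikely to be the fastest solution, though: seq[i:] builds a new string at every position, so the total work grows quadratically with the length of seq.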
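One way to keep the comprehension while avoiding a slice at every index is to pass the start offset to str.startswith, which accepts an optional start argument. This is a minimal sketch rather than a benchmarked recommendation, and the name find_all_startswith is hypothetical.

```python
def find_all_startswith(seq, sub):
    """Return every index i at which seq starts with sub."""
    # Stop at the last index where sub could still fit; if sub is longer
    # than seq the range is empty and the result is [].
    last_start = len(seq) - len(sub)
    return [i for i in range(last_start + 1) if seq.startswith(sub, i)]


print(find_all_startswith("abcabcabcabcabcabca", "bc"))  # [1, 4, 7, 10, 13, 16]
```

Each position is still tested in Python code, so for long inputs a loop around seq.find, which skips ahead to the next match, may well be quicker. Measuring with timeit on representative data is the safest way to decide.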