If I have a string "abbabaabbaba" and I want to find the ordered occurrences of some substrings, "ab" and "bb", I can do this with multiple .find() calls:
def foo(string, substrings):
    tuples = []
    for substring in substrings:
        # Search the original string so indices are not shifted by slicing.
        index = string.find(substring)
        while index != -1:
            tuples.append((index, substring))
            # Resume one character later so overlapping matches are found.
            index = string.find(substring, index + 1)
    return sorted(tuples)
But is there perhaps a shorter way? Something like:
def bar(string, substring):
    return ((index, substring) for substring in string.find(substring) if index != -1)
(but one that works)
Example:
>>> foo("abbabaabbaba", ["ab", "bb"])
[(0, "ab"), (1, "bb"), (3, "ab"), (6, "ab"), (7, "bb"), (9, "ab")]
Answer 0 (score: 1)
You can use a list comprehension and string slicing, as in this example:
def get_occurrence(a, args, step=2):
    # Assumes every substring in args has length `step`.
    return [(k, a[k:k+step]) for k in range(len(a)) if a[k:k+step] in args]

a = "abbabaabbaba"
occurrences = get_occurrence(a, ['ab', 'bb'])
print(occurrences)
Output:
[(0, 'ab'), (1, 'bb'), (3, 'ab'), (6, 'ab'), (7, 'bb'), (9, 'ab')]
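The comprehension above only works when every substring has the same length as `step`. A minimal sketch for substrings of mixed lengths, assuming `str.startswith` with a start offset is acceptable in place of slicing; the name `get_occurrences_any_length` is hypothetical and not part of the original answer:

```python
def get_occurrences_any_length(text, substrings):
    # Test every substring at every start position. startswith(s, k)
    # checks for a match at offset k without building a slice.
    return sorted(
        (k, s)
        for k in range(len(text))
        for s in substrings
        if s and text.startswith(s, k)
    )

print(get_occurrences_any_length("abbabaabbaba", ["ab", "bb", "aba"]))
```

This does len(text) * len(substrings) checks, which is fine for short inputs like the one in the question but may be slow for long strings or many substrings.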
Answer 1 (score: 0)
Can I use the hot new regex library?
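import regex

def foo(string, substrings):
    # Alternation of the escaped substrings, captured as group 1.
    pattern = '(' + '|'.join(regex.escape(s) for s in substrings) + ')'
    # overlapped=True lets finditer report matches that overlap each other.
    return [(match.start(0), match.group(1))
            for match in regex.finditer(pattern, string, overlapped=True)]
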
foo("abbabaabbaba", ["ab", "bb"])
# -> [(0, 'ab'), (1, 'bb'), (3, 'ab'), (6, 'ab'), (7, 'bb'), (9, 'ab')]
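
If installing the third-party `regex` module is not an option, a similar result can probably be had from the standard-library `re` module by wrapping the alternation in a zero-width lookahead. This is a sketch rather than a drop-in replacement; the name `find_overlapping` is hypothetical:

```python
import re

def find_overlapping(string, substrings):
    # A lookahead consumes no characters, so finditer retries at every
    # position and overlapping occurrences are all reported.
    alternatives = "|".join(re.escape(s) for s in substrings)
    pattern = re.compile("(?=(" + alternatives + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(string)]

print(find_overlapping("abbabaabbaba", ["ab", "bb"]))
# -> [(0, 'ab'), (1, 'bb'), (3, 'ab'), (6, 'ab'), (7, 'bb'), (9, 'ab')]
```

One caveat: when two substrings match at the same position (for example "ab" and "aba"), only the alternative listed first is reported for that position.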