I have a string like this:
string = 'aaabbbcccddd'
I would like a list containing every overlapping substring of length 3, that is:
aaa, aab, abb, bbb, bbc, bcc, ccc, ccd, cdd, ddd
How can I get there? re.finditer and re.findall do not return overlapping matches, and overlapping matches are exactly what I need.
Answer 0: (score: 5)
Well, there is a simple way to do this:
>>> for a, b, c in zip(string[:], string[1:], string[2:]):
...     print(a, b, c)
...
a a a
a a b
a b b
b b b
b b c
b c c
c c c
c c d
c d d
d d d
The same idea as a list comprehension:
>>> ["".join(var) for var in zip(string, string[1:], string[2:])]
['aaa', 'aab', 'abb', 'bbb', 'bbc', 'bcc', 'ccc', 'ccd', 'cdd', 'ddd']
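The zip call above hard-codes a width of 3 by listing three shifted copies of the string. As a minimal sketch, it could be generalized to any width n by building the shifted copies in a loop; the helper name `zip_windows` is hypothetical and not part of the original answer.

```python
def zip_windows(s, n):
    """Return all overlapping substrings of length n by zipping n shifted copies of s."""
    # s[0:], s[1:], ..., s[n-1:]; zip stops at the shortest copy,
    # so no partial window is produced at the end.
    shifted = [s[i:] for i in range(n)]
    return ["".join(chars) for chars in zip(*shifted)]

print(zip_windows('aaabbbcccddd', 3))
# ['aaa', 'aab', 'abb', 'bbb', 'bbc', 'bcc', 'ccc', 'ccd', 'cdd', 'ddd']
```

If the string is shorter than n, the result is simply an empty list.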
Answer 1: (score: 4)
You want to create a sliding window over the string:
from itertools import islice

def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "
    it = iter(seq)
    # Fill the first window; if the input is shorter than n, nothing is yielded.
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    # Slide by one: drop the oldest element and append the next one.
    for elem in it:
        result = result[1:] + (elem,)
        yield result

print([''.join(chunk) for chunk in window(string, 3)])
This produces:
>>> string = 'aaabbbcccddd'
>>> [''.join(chunk) for chunk in window(string, 3)]
['aaa', 'aab', 'abb', 'bbb', 'bbc', 'bcc', 'ccc', 'ccd', 'cdd', 'ddd']
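The recipe above builds a new tuple on every step with `result[1:] + (elem,)`. As a sketch of one possible alternative, a `collections.deque` with `maxlen=n` can hold the current window and discard the oldest element automatically. The name `window_deque` is hypothetical, and the sketch assumes n is at least 1.

```python
from collections import deque

def window_deque(seq, n=2):
    """Yield overlapping tuples of width n from any iterable (assumes n >= 1)."""
    buf = deque(maxlen=n)  # when full, appending drops the oldest element
    for elem in seq:
        buf.append(elem)
        if len(buf) == n:
            yield tuple(buf)

print([''.join(w) for w in window_deque('aaabbbcccddd', 3)])
# ['aaa', 'aab', 'abb', 'bbb', 'bbc', 'bcc', 'ccc', 'ccd', 'cdd', 'ddd']
```

Like the original recipe, this works on any iterable, including generators, because it never indexes into the input.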
Answer 2: (score: 3)
An alternative that can surely be improved:
>>> s = 'aaabbbcccddd'
>>> [s[i:i+3] for i in range(len(s)-2)]
['aaa', 'aab', 'abb', 'bbb', 'bbc', 'bcc', 'ccc', 'ccd', 'cdd', 'ddd']
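As for the regex route mentioned in the question: re.findall can be made to return overlapping matches by wrapping the pattern in a zero-width lookahead that contains a capturing group. The following is only a sketch; the helper name `overlapping` is hypothetical, `re.DOTALL` is an assumption made so that newlines count as characters too, and n is assumed to be at least 1.

```python
import re

def overlapping(s, n=3):
    """Return every overlapping length-n substring of s using a lookahead (assumes n >= 1)."""
    # (?=...) consumes no characters, so the scan advances one position at a time;
    # the inner group captures the n characters that start at each position.
    pattern = re.compile(r'(?=(.{%d}))' % n, re.DOTALL)
    return pattern.findall(s)

print(overlapping('aaabbbcccddd'))
# ['aaa', 'aab', 'abb', 'bbb', 'bbc', 'bcc', 'ccc', 'ccd', 'cdd', 'ddd']
```

For fixed-width windows the slicing version above is simpler; the lookahead form is mainly useful when the window is defined by a real pattern rather than a length.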