I have a string that looks like this:
string = 'TTHHTHHTHHHHTTHHHTTT'
How can I count the runs in this string, so that I get
5 runs of T and 4 runs of H?
Answer 0 (score: 21)
You can use a combination of itertools.groupby and collections.Counter:
>>> from itertools import groupby
>>> from collections import Counter
>>> strs = 'TTHHTHHTHHHHTTHHHTTT'
>>> Counter(k for k, g in groupby(strs))
Counter({'T': 5, 'H': 4})
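itertools.groupby groups consecutive items that share the same key. (By default, the key is the item itself, so each run of identical characters becomes one group.)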
>>> from pprint import pprint
>>> pprint([(k, list(g)) for k, g in groupby(strs)])
[('T', ['T', 'T']),
 ('H', ['H', 'H']),
 ('T', ['T']),
 ('H', ['H', 'H']),
 ('T', ['T']),
 ('H', ['H', 'H', 'H', 'H']),
 ('T', ['T', 'T']),
 ('H', ['H', 'H', 'H']),
 ('T', ['T', 'T', 'T'])]
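The first item of each pair is the key (k) by which the items were grouped, and list(g) is the group belonging to that key. Since we are only interested in the key, and groupby yields one key per run, we can pass each k to collections.Counter to get the desired answer.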
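As a minimal sketch, the same idea can be wrapped in a small function. The name count_runs is hypothetical and only used here for illustration; it is not part of itertools or collections.

```python
from collections import Counter
from itertools import groupby


def count_runs(s):
    """Count the runs of each distinct item in s (hypothetical helper)."""
    # groupby yields one (key, group) pair per run of consecutive equal
    # items; the group itself is ignored because only the key is counted.
    return Counter(k for k, _ in groupby(s))


runs = count_runs('TTHHTHHTHHHHTTHHHTTT')
print(runs['T'], runs['H'])  # 5 4
```

An empty string simply produces an empty Counter, since groupby yields no groups.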
Answer 1 (score: 2)
For variety, a re-based approach:
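import re
letters = ['H', 'T']
# Each match is one run: a letter followed by zero or more repeats of itself (\1*).
# Characters not listed in letters (such as the 'Z' here) are skipped.
matches = re.findall(r'({})\1*'.format('|'.join(letters)), 'TTHHTHHZTHHHHTTHHHTTT')
print(matches)
# ['T', 'H', 'T', 'H', 'T', 'H', 'T', 'H', 'T']
print([(letter, matches.count(letter)) for letter in letters])
# [('H', 4), ('T', 5)]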
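If the symbols might be regex metacharacters, the pattern needs escaping. The following is a rough sketch under that assumption; the function name count_runs_re is hypothetical, and it assumes every symbol is a single character.

```python
import re


def count_runs_re(s, letters):
    """Count runs of each single-character symbol in s (hypothetical helper)."""
    # re.escape guards against symbols that are regex metacharacters, e.g. '+'.
    pattern = r'({})\1*'.format('|'.join(re.escape(c) for c in letters))
    # With one capturing group, findall returns the captured symbol of each run.
    matches = re.findall(pattern, s)
    return {c: matches.count(c) for c in letters}


print(count_runs_re('TTHHTHHZTHHHHTTHHHTTT', ['H', 'T']))  # {'H': 4, 'T': 5}
print(count_runs_re('++-+--', ['+', '-']))
```

Characters that are not in letters are ignored, just as the 'Z' is above.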