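I have a string of letters, 'aaabbbcccdddeeefffggg', and I want to read it as 3-letter words, for example 'aaa', 'bbb', 'ccc', ...
Is there any code you know of that can do this?
My end goal is to assign a number to each word, such as
aaa = 123
bbb = 234
ccc = 356 ...
and to output each word's position in the sentence together with that value.
So for the sentence 'aaabbbcccdddeeefffggg', the three-letter words are 'aaa', 'bbb', 'ccc', ...
aaa would be at the first position (1), bbb at the second position (2), and ccc at the third position (3).
So in the end I would get
(1,123),(2,234),(3,356) for 'aaa','bbb','ccc'
I have been trying for hours and can't figure out how to do this, so any help would be greatly appreciated.
Thanks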
Answer 0: (score: 1)
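Something like this?
data = 'aaabbbcccdddeeefffggg'
# The trailing ... is a placeholder for the remaining words.
trans = {'aaa': 123, 'bbb': 234, 'ccc': 356, ...}
# data[::3] takes the first letter of each group; y * 3 rebuilds the word.
[(x + 1, trans[y * 3]) for x, y in enumerate(data[::3])]
Otherwise:
def trans(c):
    # Offsets from 'a' build the digits: 'a' -> 123, 'b' -> 234, ...
    a = ord('a')
    return ord(c) - a + 3 + 10 * (ord(c) - a + 2) + 100 * (ord(c) - a + 1)

data = 'aaabbbcccdddeeefffggg'
[(x + 1, trans(y)) for x, y in enumerate(data[::3])]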
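As a quick check of the formula in the second variant: writing k = ord(c) - ord('a'), the expression is 100 * (k + 1) + 10 * (k + 2) + (k + 3), which simplifies to 111 * k + 123. The sketch below only verifies that arithmetic; the name `trans_formula` is made up here for illustration. Note that it gives 345 for 'c', not the 356 listed in the question, so the dictionary variant is probably the safer choice if the values do not follow a pattern.

```python
def trans_formula(c):
    """Digit-pattern value for a lowercase letter: 'a' -> 123, 'b' -> 234, ..."""
    k = ord(c) - ord('a')  # 0 for 'a', 1 for 'b', ...
    return 100 * (k + 1) + 10 * (k + 2) + (k + 3)

# The expression is linear in k: 111 * k + 123.
values = [trans_formula(c) for c in 'abc']
print(values)  # [123, 234, 345]
```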
Answer 1: (score: 0)
>>> import re
>>> re.findall(".{3}" ,"aaabbbcccdddeeefffggg")
['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg']
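To get from these chunks to the (position, value) pairs the question asks for, one possible sketch combines `re.findall` with `enumerate`. The `values` mapping is an assumption: it contains only the three words the question gives numbers for, and words without a value are simply skipped, which may or may not be the behaviour wanted.

```python
import re

# Assumed mapping, using only the values given in the question.
values = {'aaa': 123, 'bbb': 234, 'ccc': 356}

words = re.findall(r".{3}", "aaabbbcccdddeeefffggg")

# enumerate(..., start=1) gives 1-based positions; unknown words are skipped.
pairs = [(pos, values[w]) for pos, w in enumerate(words, start=1) if w in values]
print(pairs)  # [(1, 123), (2, 234), (3, 356)]
```

Keep in mind that `.{3}` only matches complete groups, so any one or two leftover characters at the end of the string are silently dropped.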
Answer 2: (score: 0)
>>> a = "aaabbbcccdddeeefffggg"
>>> [a[i:i+3] for i in range(0, len(a), 3)]
['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg']
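One thing worth knowing about the slicing approach: when the length is not a multiple of 3, the last slice is just shorter rather than missing. A small sketch, where `split_words` is a hypothetical helper name and the input string is made up for the demonstration:

```python
def split_words(s, n=3):
    """Split s into consecutive n-character words (hypothetical helper)."""
    return [s[i:i + n] for i in range(0, len(s), n)]

print(split_words("aaabbbcc"))  # ['aaa', 'bbb', 'cc']

# Keep only complete words if a short trailing chunk is not wanted.
complete = [w for w in split_words("aaabbbcc") if len(w) == 3]
print(complete)  # ['aaa', 'bbb']
```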
Answer 3: (score: 0)
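ch = 'bbbiiieeefffhhhaaacccddd'
d = dict(zip(('aaa','bbb','ccc','ddd','eee','fff','ggg','hhh','iii'),
             ('123','234','345','456','567','678','789','8910','91011')))

def lect(x):
    # Pull three characters at a time from a single iterator.
    gen = iter(x)
    while True:
        try:
            yield ''.join((next(gen), next(gen), next(gen)))
        except StopIteration:
            # Input exhausted; an incomplete trailing group is dropped.
            return

print([(i + 1, d[x]) for i, x in enumerate(lect(ch))])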
Or:
import re

ch = 'bbbiiieeefffhhhaaacccddd'
d = dict(zip(('aaa','bbb','ccc','ddd','eee','fff','ggg','hhh','iii'),
             ('123','234','345','456','567','678','789','8910','91011')))
# Alternation of all known words: 'aaa|bbb|...'
pat = re.compile('|'.join(d.keys()))
# Integer division turns a match's start offset into a 1-based word position.
print([((mat.start() // 3) + 1, d[mat.group()]) for mat in pat.finditer(ch)])