Encoding repeated letters in a string with numbers

Time: 2018-11-20 20:42:40

Tags: python

The string "abc" must become "a1b1c1", and the string 'aaabcca' must become 'a3b1c2a1'.

I wrote a Python function, but it fails to add the last letter, so 'abc' comes out as just 'a1b1'.

4 answers:

Answer 0: (score: 3)

Use itertools.groupby:

>>> from itertools import groupby
>>> s = 'aaabcca'
>>> ''.join('{}{}'.format(c, sum(1 for _ in g)) for c, g in groupby(s))
'a3b1c2a1'

Details of what groupby yields:

>>> groups = groupby(s)
>>> [(char, list(group)) for char, group in groups]
[('a', ['a', 'a', 'a']), ('b', ['b']), ('c', ['c', 'c']), ('a', ['a'])]
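
The one-liner above can also be wrapped in a small function, which makes it easier to see that each group is a one-shot iterator that has to be counted rather than passed to len(). This is only a sketch; the name encode_runs is made up for illustration and does not come from the answer.

```python
from itertools import groupby


def encode_runs(text):
    """Return each run of equal characters as the character followed by its run length."""
    parts = []
    for char, group in groupby(text):
        # group is a one-shot iterator over the run, so count its items.
        run_length = sum(1 for _ in group)
        parts.append(f"{char}{run_length}")
    return "".join(parts)


print(encode_runs("aaabcca"))  # a3b1c2a1
print(encode_runs("abc"))      # a1b1c1
```

Because groupby only groups consecutive equal items, the trailing 'a' in 'aaabcca' gets its own count instead of being merged with the first run.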

Answer 1: (score: 3)

A bit of regex magic:

import re

s = 'aaawbbbccddddd'
counts = re.sub(r'(.)\1*', lambda m: m.group(1) + str(len(m.group())), s)
print(counts)

Output:

a3w1b3c2d5

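Details

Regex pattern:

  • (.) - captures a single character into the first capturing group (. matches any character except a newline by default)
  • \1* - matches zero or more consecutive occurrences of \1, a backreference to the value of the first capturing group (that is, a possible run of the same character)

Replacement:

  • m.group(1) - the value of the first captured group
  • str(len(m.group())) - the length of the whole matched run of characters, as a string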
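
To see what the pattern actually matches, the runs can be listed with re.finditer before substituting. The sketch below is illustrative only: RUN_PATTERN, show_runs and encode_runs_regex are made-up names, and the re.DOTALL flag is an addition not present in the answer, included so that runs of newline characters are counted as well.

```python
import re

# Hypothetical name; re.DOTALL is an added assumption so "." also matches newlines.
RUN_PATTERN = re.compile(r"(.)\1*", re.DOTALL)


def show_runs(text):
    """List (character, run length) for each match of the pattern."""
    return [(m.group(1), len(m.group())) for m in RUN_PATTERN.finditer(text)]


def encode_runs_regex(text):
    """Replace each run with its character followed by the run length."""
    return RUN_PATTERN.sub(lambda m: m.group(1) + str(len(m.group())), text)


print(show_runs("aaawbbbccddddd"))
# [('a', 3), ('w', 1), ('b', 3), ('c', 2), ('d', 5)]
print(encode_runs_regex("aaawbbbccddddd"))  # a3w1b3c2d5
```

Without re.DOTALL, newline characters would simply not match and re.sub would pass them through unchanged.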

Answer 2: (score: 1)

You forgot to explicitly add the last run once the loop has finished.

string = "aaabb"
coded = ''
if string:
    count = 1   # start with the first char, not zero!
    prev = string[0]
    for i in range(1, len(string)):
        current = string[i]
        if current == prev:
            count += 1
        else:
            coded += prev
            coded += str(count)
            count = 1
            prev = current
    coded += prev       # these two lines add the last run;
    coded += str(count) # they stay inside the if so an empty string does not fail

print(coded)  # -> a3b2

That said, I would prefer a simpler loop:

string = "aaabbcc"
coded = ''
while string:
    i = 0
    while i < len(string) and string[0] == string[i]:
        i += 1
    coded += string[0]+str(i)
    string = string[i:]

print(coded)
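
The loop above re-slices the string on every pass, which copies the remainder each time. A possible variation, sketched here with a made-up function name, keeps the same idea but moves two indices over the original string instead.

```python
def encode_runs_scan(text):
    """Encode runs by scanning with two indices instead of slicing the input."""
    parts = []
    start = 0
    while start < len(text):
        end = start
        # Advance end to the first position whose character differs from text[start].
        while end < len(text) and text[end] == text[start]:
            end += 1
        parts.append(text[start] + str(end - start))
        start = end
    return "".join(parts)


print(encode_runs_scan("aaabbcc"))  # a3b2c2
```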

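Answer 3: (score: 1)

If you want to know why your code doesn't work, or would rather not import any library modules, here is a working version of your code: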

string = "aaabbcc"
coded = ''

if len(string) == 0:
    print('')

else:
    count = 0
    prev = string[0]
    for i in range(1, len(string)):
        current = string[i]
        count += 1

        if current != prev:
            coded += prev
            coded += str(count)
            count = 0

        prev = current

    coded += prev  # prev, not current: current is undefined for a one-character string
    coded += str(count + 1)

print(coded) # -> a3b2c2
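
The symptom from the question ('abc' giving 'a1b1') is what a loop of this kind produces when the last run is never written out. As a rough sketch, the same single-pass approach can be put in a function (the name encode_runs_loop is hypothetical) and tried on the edge cases: an empty string, a single character, and the two examples from the question.

```python
def encode_runs_loop(text):
    """Single pass with a running count; the last run is flushed after the loop."""
    if not text:
        return ""
    parts = []
    prev = text[0]
    count = 1
    for current in text[1:]:
        if current == prev:
            count += 1
        else:
            parts.append(prev + str(count))
            prev = current
            count = 1
    # Without this final flush, "abc" would come out as "a1b1".
    parts.append(prev + str(count))
    return "".join(parts)


for sample in ["", "a", "abc", "aaabcca"]:
    print(repr(sample), "->", repr(encode_runs_loop(sample)))
```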