I need to write a function that replaces runs of repeated consecutive characters with a single character, for example:
'hiiii how are you??' -> 'hi how are you?'
'aahhhhhhhhhh whyyyyyy' -> 'ah why'
'foo' -> 'fo'
'oook. thesse aree enoughh examplles.' -> 'ok. these are enough examples.'
Answer 0 (score: 4)
The solution can be expressed very compactly with itertools.groupby:
>>> import itertools
>>> ''.join(g[0] for g in itertools.groupby('hiiii how are you??'))
'hi how are you?'
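itertools.groupby groups objects by a given key function. A group keeps accumulating as long as consecutive keys are equal. If no key function is provided, the item itself is used as the key, which in this case is the character.
Once the characters have been grouped this way, the keys can be joined into a single string. Each group is returned as a tuple containing the key and an internal itertools._grouper object, which you can ignore for your purposes; just extract the character.
This can be turned into the following function: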
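To make the grouping visible, here is a minimal sketch that lists each key together with the length of its run. The helper name `runs` is hypothetical and not part of the original answer.

```python
import itertools

def runs(s):
    """Return (character, run_length) pairs for the consecutive runs in s."""
    # groupby yields (key, group_iterator) pairs; each group iterator is
    # consumed here (via list) before moving on to the next group.
    return [(key, len(list(group))) for key, group in itertools.groupby(s)]

print(runs('hiiii'))  # [('h', 1), ('i', 4)]
print(runs('abba'))   # [('a', 1), ('b', 2), ('a', 1)]
```

Keeping only the first element of each pair and joining the results gives the de-duplicated string.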
import itertools

def remove_repeated_characters(s):
    # Each group is a (key, iterator) pair; the key is the character itself.
    groups = itertools.groupby(s)
    cleaned = ''.join(g[0] for g in groups)
    return cleaned
This produces the expected values:
>>> [remove_repeated_characters(s)
...  for s in ['hiiii how are you??', 'aahhhhhhhhhh whyyyyyy',
...            'foo', 'oook. thesse aree enoughh examplles.']]
['hi how are you?', 'ah why', 'fo', 'ok. these are enough examples.']
Answer 1 (score: 3)
You could try a regular expression like (.)\1+, i.e. "something, then more of the same thing", and replace it with \1, i.e. "that first thing".
>>> import re
>>> re.sub(r"(.)\1+", r"\1", 'aahhhhhhhhhh whyyyyyy')
'ah why'
>>> re.sub(r"(.)\1+", r"\1", 'oook. thesse aree enoughh examplles.')
'ok. these are enough examples.'
Use functools.partial (or any other way you like) to turn it into a function:
>>> import functools
>>> dedup = functools.partial(re.sub, r"(.)\1+", r"\1")
>>> dedup('oook. thesse aree enoughh examplles.')
'ok. these are enough examples.'
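One caveat worth noting: by default `.` does not match a newline, so runs of blank lines are left untouched by this pattern. If that matters, a compiled pattern with `re.DOTALL` is one option. This is only a sketch; the names `_REPEAT` and `dedup_all` are hypothetical.

```python
import re

# (.)\1+ matches any character followed by one or more copies of itself.
# re.DOTALL lets "." match "\n" as well, so repeated newlines also collapse.
_REPEAT = re.compile(r"(.)\1+", re.DOTALL)

def dedup_all(s):
    """Collapse every run of a repeated character, newlines included."""
    return _REPEAT.sub(r"\1", s)

print(repr(dedup_all("hiiii\n\n\nyou??")))  # 'hi\nyou?'
```

Compiling the pattern once also avoids repeating the two pattern strings at every call site.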
Answer 2 (score: 2)
def dup_char_remover(text):
    output = ""
    previous = ""  # last character appended to output
    for c in text:
        if previous != c:
            output = output + c
            previous = c
    return output

text = "hiiii how arrrre youuu"
output = dup_char_remover(text)
print(output)
hi how are you
Answer 3 (score: 1)
Use a simple iteration.
Demo:
def cleanText(val):
    result = []
    for i in val:
        if not result:
            result.append(i)
        else:
            # Append only when the character differs from the previous one.
            if result[-1] != i:
                result.append(i)
    return "".join(result)

s = ['hiiii how are you??', 'aahhhhhhhhhh whyyyyyy', 'foo', 'oook. thesse aree enoughh examplles.']
for i in s:
    print(cleanText(i))
Output:
hi how are you?
ah why
fo
ok. these are enough examples.
Answer 4 (score: 0)
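from collections import OrderedDict

def removeDupWord(word):
    # Keeps only the first occurrence of each character in the word, so
    # non-consecutive repeats are removed too (e.g. 'these' -> 'thes').
    return "".join(OrderedDict.fromkeys(word))

def removeDupSentence(sentence):
    words = sentence.split()
    # Join with single spaces; avoids the trailing space of the original.
    return ' '.join(removeDupWord(word) for word in words)

sentence = 'hiiii how are you??'
print(removeDupSentence(sentence))
hi how are you?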
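Be aware that `OrderedDict.fromkeys` drops every later occurrence of a character within a word, not only consecutive repeats, so this approach does not always match what the question asks for. A small sketch of the difference, with `itertools.groupby` used for comparison (both helper names are hypothetical):

```python
from collections import OrderedDict
from itertools import groupby

def unique_chars(word):
    # Keeps only the first occurrence of each character in the word.
    return "".join(OrderedDict.fromkeys(word))

def collapse_runs(word):
    # Keeps one character per consecutive run.
    return "".join(key for key, _ in groupby(word))

print(collapse_runs("thesse"))  # these
print(unique_chars("thesse"))   # thes
```

For inputs like 'hiiii how are you??' the two agree, which is why the example above still prints the expected sentence.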