Question

I need to write a function that replaces repeated consecutive characters with a single character, for example:

 'hiiii how are you??' -> 'hi how are you?'
 'aahhhhhhhhhh whyyyyyy' -> 'ah why'
 'foo' -> 'fo'
 'oook. thesse aree enoughh examplles.' -> 'ok. these are enough examples.'

Answer 1

The solution can be expressed very compactly with itertools.groupby:

>>> import itertools
>>> ''.join(g[0] for g in itertools.groupby('hiiii how are you??'))
'hi how are you?'

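itertools.groupby groups objects by a given key function. Consecutive items are accumulated into one group for as long as their keys are equal. If no key function is provided, the item itself is used as the key, which in this case is the character.

Once the characters are grouped this way, you can join the keys into a single string. Each group is yielded as a tuple containing the key (here the character) and an internal itertools._grouper iterator over the group's items; for your purposes you can ignore the iterator and extract just the character.

This can be turned into a function as follows: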

def remove_repeated_characters(s):
    groups = itertools.groupby(s)
    cleaned = ''.join(g[0] for g in groups)
    return cleaned

This produces the expected values:

>>> [remove_repeated_characters(s)
...  for s in ['hiiii how are you??', 'aahhhhhhhhhh whyyyyyy',
...            'foo', 'oook. thesse aree enoughh examplles.']]
['hi how are you?', 'ah why', 'fo', 'ok. these are enough examples.']
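
To make the grouping described in this answer concrete, here is a small sketch that materializes each (key, group) pair that groupby yields. The helper name run_lengths is hypothetical and not part of the original answer.

```python
import itertools

def run_lengths(s):
    """Return (character, run length) pairs for each run of equal characters."""
    # groupby yields (key, group_iterator) pairs; each group iterator has to be
    # consumed before moving on to the next pair, so it is turned into a list here.
    return [(key, len(list(group))) for key, group in itertools.groupby(s)]

print(run_lengths('hiiii'))  # [('h', 1), ('i', 4)]

# Keeping only the key of each pair gives the de-duplicated string.
print(''.join(key for key, _ in itertools.groupby('hiiii how are you??')))
```

Unpacking the pair as key, group is equivalent to indexing with g[0]; it only makes the ignored iterator explicit.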

Answer 2

You could try a regular expression such as (.)\1+, i.e. "something, then more of the same thing", and replace it with \1, i.e. "just that first thing".

>>> import re
>>> re.sub(r"(.)\1+", r"\1", 'aahhhhhhhhhh whyyyyyy')
'ah why'
>>> re.sub(r"(.)\1+", r"\1", 'oook. thesse aree enoughh examplles.')
'ok. these are enough examples.'

Use functools.partial (or any other way you like) to make it a function:

>>> import functools
>>> dedup = functools.partial(re.sub, r"(.)\1+", r"\1")
>>> dedup('oook. thesse aree enoughh examplles.')
'ok. these are enough examples.'
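
One caveat worth sketching: by default "." does not match a newline, so the pattern above leaves runs of blank lines untouched. If those should collapse as well, the re.DOTALL flag can be set. The names _REPEAT and dedup_all below are hypothetical, chosen only for this illustration.

```python
import re

# Compile once and reuse; re.DOTALL lets "." match "\n" as well,
# so runs of repeated newlines are collapsed too.
_REPEAT = re.compile(r"(.)\1+", re.DOTALL)

def dedup_all(s):
    """Collapse every run of a repeated character, newlines included."""
    return _REPEAT.sub(r"\1", s)

print(repr(dedup_all('aahh\n\n\nwhyyy')))  # 'ah\nwhy'
```

Whether newlines should be collapsed depends on the input; the question only shows single-line strings, for which both variants behave the same.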

Answer 3

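def dup_char_remover(text):
    # 'text' is used instead of 'input' to avoid shadowing the built-in
    output = ""
    prev = ""  # the previously seen character
    for c in text:
        # keep a character only if it differs from the one before it
        if prev != c:
            output = output + c
        prev = c
    return output

text = "hiiii how arrrre youuu"
output = dup_char_remover(text)
print(output)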

hi how are you

Answer 4

Use simple iteration.

Demo:

def cleanText(val):
    result = []
    for i in val:
        if not result:
            result.append(i)
        else:
            if result[-1] != i:
                result.append(i)
    return "".join(result)

s = ['hiiii how are you??', 'aahhhhhhhhhh whyyyyyy', 'foo', 'oook. thesse aree enoughh examplles.']
for i in s:
    print(cleanText(i))

Output:

hi how are you?
ah why
fo
ok. these are enough examples.

Answer 5

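from collections import OrderedDict

def removeDupWord(word):
    # Caveat: OrderedDict.fromkeys keeps only the first occurrence of each
    # character, so non-adjacent repeats are dropped too ('examples' -> 'exampls').
    return "".join(OrderedDict.fromkeys(word))

def removeDupSentence(sentence):
    # split() with no arguments also collapses runs of whitespace
    words = sentence.split()
    return ' '.join(removeDupWord(word) for word in words)


sentence = 'hiiii how are you??'
print(removeDupSentence(sentence))

hi how are you?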

Replacing repeated consecutive characters in Python

5 answers: