How can I find and remove consecutive repeated punctuation in Python without using regular expressions?

Time: 2015-09-09 17:03:07

Tags: python punctuation

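I want to collapse runs of the same punctuation mark so that only one of them remains.

If I have string = 'Is it raining????', I want to get string = 'Is it raining?', but I don't want to collapse '...'.

I also need to do this without using regular expressions. I'm a Python beginner and would really appreciate any suggestions or hints. Thanks :))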

3 answers:

Answer 0 (score: 2)

Another groupby approach:

from itertools import groupby 
from string import punctuation

punc = set(punctuation) - set('.')

s = 'Thisss is ... a test!!! string,,,,, with 1234445556667 rrrrepeats????'
print(s)

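newtext = []
# groupby yields each run of identical consecutive characters
for k, g in groupby(s):
    if k in punc:
        newtext.append(k)   # punctuation run: keep a single copy
    else:
        newtext.extend(g)   # anything else (including '.'): keep the whole run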

print(''.join(newtext))

Output:

Thisss is ... a test!!! string,,,,, with 1234445556667 rrrrepeats????
Thisss is ... a test! string, with 1234445556667 rrrrepeats?
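For reuse, the groupby idea above could be wrapped in a function. This is only a sketch: the name `collapse_punctuation` and the `keep` parameter are made up for illustration and are not part of the original answer.

```python
from itertools import groupby
from string import punctuation


def collapse_punctuation(text, keep='.'):
    """Collapse each run of a repeated punctuation character to one copy.

    `keep` (hypothetical parameter) lists punctuation that is left untouched.
    """
    collapsible = set(punctuation) - set(keep)
    pieces = []
    for char, run in groupby(text):
        if char in collapsible:
            pieces.append(char)           # keep one copy of the run
        else:
            pieces.append(''.join(run))   # keep the whole run as-is
    return ''.join(pieces)


print(collapse_punctuation('Is it raining????'))
print(collapse_punctuation('a--b', keep='.-'))
```

Passing a longer `keep` string, as in the second call, protects more characters than just the period.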

Answer 1 (score: 0)

How about the following approach:

import string

text = 'Is it raining???? No but...,,,, it is snoooowing!!!!!!!'

for punctuation in string.punctuation:
    if punctuation != '.':
        # Keep replacing doubled marks until the text stops changing
        while True:
            replaced = text.replace(punctuation * 2, punctuation)
            if replaced == text:
                break
            text = replaced

print(text)

This gives the following output:

Is it raining? No but..., it is snoooowing!
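The loop matters because `str.replace` only substitutes non-overlapping matches in a single pass, so one call just halves a long run. A small sketch, using a shortened sample string chosen for illustration:

```python
text = 'raining????'

# A single pass replaces each non-overlapping pair: '????' becomes '??'
once = text.replace('??', '?')
print(once)   # raining??

# Repeating until no doubled mark is left collapses the whole run
while '??' in text:
    text = text.replace('??', '?')
print(text)   # raining?
```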

Or a more efficient version:

import string

text = 'Is it raining???? No but...,,,, it is snoooowing!!!!!!!'
last = None
output = []

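for c in text:
    # Skip a punctuation mark only when it repeats the character right before it
    if c == '.' or c not in string.punctuation or c != last:
        output.append(c)
    last = c  # always track the previous character, so 'a? b?' keeps both marks

print(''.join(output))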

Answer 2 (score: -1)

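from itertools import groupby

s = 'Is it raining???? okkkk!!! ll... yeh""" ok?'
# Collect every character that repeats the one before it, skipping letters and '.'.
# Note: `not ch.isalpha()` also matches digits and whitespace, not only punctuation.
replaceables = [ch for i, ch in enumerate(s) if i > 0 and s[i - 1] == ch and (not ch.isalpha() and ch != '.')]
# Group the surplus characters into runs, e.g. ['?', '?', '?']
replaceables = [list(g) for k, g in groupby(replaceables)]

start = 0
for replaceable in replaceables:
    replaceable = ''.join(replaceable)
    # Find the run and delete the surplus copies, leaving one character behind
    start = s.find(replaceable, start)
    r = s[start:].replace(replaceable, '', 1)
    s = s.replace(s[start:], r)
print(s)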
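The same comparison with the previous character (`s[i - 1] == ch`) can also build the result directly, which avoids the find-and-replace step. A minimal sketch, assuming only `string.punctuation` characters should be collapsed; the function name `drop_repeats` is hypothetical:

```python
from string import punctuation


def drop_repeats(s):
    # Keep a character unless it is punctuation (other than '.')
    # that repeats the character immediately before it.
    kept = [ch for i, ch in enumerate(s)
            if not (i > 0 and s[i - 1] == ch and ch in punctuation and ch != '.')]
    return ''.join(kept)


print(drop_repeats('Is it raining???? okkkk!!! ll... yeh""" ok?'))
```

Because membership is tested against `punctuation`, repeated digits, letters and spaces are left alone.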