Given the input:
I can do waaaaaaaaaaaaay better :DDDD!!!! I am sooooooooo exicted about it :))) Good !!
Desired output:
I can do way/LNG better :D/LNG !/LNG I am so/LNG exicted about it :)/LNG Good !/LNG
---Challenge:
----Problem: error message "unbalanced parenthesis"
Any ideas?
My code is:
import re
lengWords = {}  # a dictionary of lengthened words
def removeDuplicates(corpus):
    data = (open(corpus, 'r').read()).split()
    myString = " ".join(data)
    for word in data:
        for chr in word:
            countChr = word.count(chr)
            if countChr >= 3:
                lengWords[word] = word+"/LNG"
                lengWords[word] = re.sub(r'([A-Za-z])\1+', r'\1', lengWords[word])
                lengWords[word] = re.sub(r'([\'\!\~\.\?\,\.,\),\(])\1+', r'\1', lengWords[word])
        for k, v in lengWords.items():
            if k == word:
                re.sub(word, v, myString)
    return myString
Answer 0 (score: 1)
This is not a perfect solution, and I don't have time to refine it right now. I just wanted to get you started with a simple approach:
s = "I can do waaaaaaaaaaaaay better :DDDD!!!! I am sooooooooo exicted about it :))) Good !!"
re.sub(r'(.)(\1{2,})',r'\1/LNG',s)
>> 'I can do wa/LNGy better :D/LNG!/LNG I am so/LNG exicted about it :)/LNG Good !!'
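On the error itself: a likely cause is the call re.sub(word, v, myString) in the question, which uses each token as a regex pattern. A token such as ':)))' contains unescaped ')' characters, so the pattern fails to compile. The result of that re.sub is also never assigned, so myString would come back unchanged even without the error. Below is a minimal sketch of one way around both issues. The names literal_sub and tag_lengthened are hypothetical, and the threshold of three repeated characters is carried over from the one-liner above as an assumption.

```python
import re

# A run of three or more identical characters (same threshold as the one-liner above).
RUN = re.compile(r'(.)\1{2,}')

def literal_sub(text, word, replacement):
    """Replace `word` in `text`, treating `word` as plain text, not as a pattern."""
    # re.escape neutralises metacharacters such as ')' that would otherwise
    # make the pattern fail to compile for tokens like ':)))'.
    return re.sub(re.escape(word), lambda m: replacement, text)

def tag_lengthened(text, tag="/LNG"):
    """Collapse runs of 3+ identical characters per token and tag changed tokens."""
    out = []
    for token in text.split():
        collapsed, n = RUN.subn(r'\1', token)  # n = number of runs collapsed
        out.append(collapsed + tag if n else token)
    return " ".join(out)

s = "I can do waaaaaaaaaaaaay better :DDDD!!!! I am sooooooooo exicted about it :))) Good !!"
print(tag_lengthened(s))
# I can do way/LNG better :D!/LNG I am so/LNG exicted about it :)/LNG Good !!
```

This tags each token once, at its end, so 'waaaaaaaaaaaaay' becomes 'way/LNG' rather than 'wa/LNGy'. It does not split ':DDDD!!!!' into two tagged pieces, and it leaves 'Good !!' alone because a run of two is below the threshold. Lowering the threshold to two would also tag ordinary words such as 'better' and 'Good'.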