Remove a word if a character is repeated multiple times

Time: 2019-02-18 03:50:41

Tags: python regex python-3.x

I want to remove a word from a sentence if the word starts with 4 or more repetitions of the same character.

For example:
['aaaaaaa is really good', 'nott something great',
       'ssssssssssssstackoverflow is a great community']

I need output like this:

['is really good', 'nott something great', 'is a great community']

I tried something like this:

^(\S)\1{3,}

It removes the repeated characters, but not the rest of the word. Thanks
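To make the problem concrete, here is a minimal sketch of what that pattern does when applied with re.sub to the example list. The variable name attempt is only illustrative and not part of the original question.

```python
import re

words = ['aaaaaaa is really good', 'nott something great',
         'ssssssssssssstackoverflow is a great community']

# The attempted pattern: one non-space character at the start of the string,
# followed by 3 or more repeats of that same character.
attempt = [re.sub(r'^(\S)\1{3,}', '', word) for word in words]
print(attempt)
# [' is really good', 'nott something great', 'tackoverflow is a great community']
```

Only the run of repeated characters is stripped, so the tail of the word ("tackoverflow") and the following space are left behind.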

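1 answer:

Answer 0: (score: 2)

Append \S*\s to the end of the pattern so that it also consumes the rest of the word and the whitespace that follows it.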

import re
words = ['aaaaaaa is really good', 'nott something great','ssssssssssssstackoverflow is a great community']
# Drop a leading word that starts with 4 or more repeats of the same character
newWords = [re.sub(r'^(\S)\1{3,}\S*\s', '', word) for word in words]

Output:

['is really good', 'nott something great', 'is a great community']

If a string may consist of a single word only, make the trailing whitespace optional: use \s? instead of \s
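A small sketch of that variant, assuming the input may contain a string that is just one word. The names pattern and cleaned are illustrative only.

```python
import re

# Same pattern as in the answer, but with the trailing whitespace made optional.
pattern = re.compile(r'^(\S)\1{3,}\S*\s?')

words = ['aaaaaaa is really good', 'nott something great',
         'ssssssssssssstackoverflow']

cleaned = [pattern.sub('', word) for word in words]
print(cleaned)
# ['is really good', 'nott something great', '']
```

With the mandatory \s, the single-word string would not match at all and would be returned unchanged. With \s? it is reduced to an empty string, which may need filtering out afterwards.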