I want to remove punctuation such as " ", ' ', ` `, "", '' and `` from my string using regex. The code I've written so far only removes the ones that have a space between them. How do I remove the empty ones, such as ''?
#Code
import re
s = "hey how ' ' is the ` ` what are '' you doing `` how is everything"
s = re.sub("' '|` `|" "|""|''|``","",s)
print(s)
My expected result:
hey how is the what are you doing how is everything
Answer 0 (score: 3)
You can use this regex to match all such quote pairs:
r'([\'"`])\s*\1\s*'
Code:
>>> s = "hey how ' ' is the ` ` what are '' you doing `` how is everything"
>>> print (re.sub(r'([\'"`])\s*\1\s*', '', s))
hey how is the what are you doing how is everything
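RegEx details:
([\'"`]): Match one of the given quote characters and capture it in group #1
\s*: Match 0 or more whitespace characters
\1: Use a backreference to group #1 to make sure we match the same closing quote
\s*: Match 0 or more whitespace characters
Answer 1 (score: 1)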
In this case, why not match all the word characters and then join them?
' '.join(re.findall(r'\w+', s))
# 'hey how is the what are you doing how is everything'
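The two approaches give the same result on the sample string, but they are not equivalent in general: joining on \w+ drops every non-word character, not just empty quote pairs. The sketch below compares them. The helper names strip_empty_quotes and join_words and the second test string are illustrative assumptions, not part of either answer.

```python
import re

# Pattern from the first answer: a quote, optional whitespace,
# the same quote again, then any trailing whitespace.
EMPTY_QUOTES = re.compile(r'([\'"`])\s*\1\s*')

def strip_empty_quotes(text):
    """Remove empty or whitespace-only quote pairs (hypothetical helper)."""
    return EMPTY_QUOTES.sub('', text)

def join_words(text):
    """Keep only runs of word characters, joined by single spaces (hypothetical helper)."""
    return ' '.join(re.findall(r'\w+', text))

s = "hey how ' ' is the ` ` what are '' you doing `` how is everything"
print(strip_empty_quotes(s))  # hey how is the what are you doing how is everything
print(join_words(s))          # hey how is the what are you doing how is everything

# An assumed input containing punctuation that should survive.
t = "don't stop '' now, ok?"
print(strip_empty_quotes(t))  # don't stop now, ok?
print(join_words(t))          # don t stop now ok
```

If the input can contain apostrophes, commas, or other punctuation that should be kept, the backreference pattern is probably the safer choice; the \w+ approach is fine when only the words matter.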