Using regex to remove words containing two or more periods in Python?

Time: 2017-09-10 07:01:55

Tags: python regex

For example, if I have a string:

"I really..like something like....that"

I only want to get:

"I something"

Any suggestions?

3 answers:

Answer 0 (score: 3)

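If you want to do it with a regex, you can remove those words by substituting the empty string for every match of this pattern (for example with re.sub):

r"[^\.\s]+\.{2,}[^\.\s]+"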


Regex explanation:

[^\.\s]+       one or more characters other than '.' and whitespace
\.{2,}         two or more '.'
[^\.\s]+       one or more characters other than '.' and whitespace

Or this regex:

r"\s+[^\.\s]+\.{2,}[^\.\s]+"
  ^^^  also consumes the whitespace before the dotted word

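As a rough sketch of how the two patterns above might be applied in Python, assuming re.sub with an empty replacement is the intended way to "remove" the matches (the variable names here are only illustrative):

```python
import re

text = "I really..like something like....that"

# First pattern: removes the dotted word but leaves the surrounding spaces.
plain = re.sub(r"[^\.\s]+\.{2,}[^\.\s]+", "", text)

# Second pattern: also consumes the whitespace before the dotted word.
with_space = re.sub(r"\s+[^\.\s]+\.{2,}[^\.\s]+", "", text)

print(repr(plain))       # 'I  something '
print(repr(with_space))  # 'I something'

# Collapsing leftover whitespace makes the first pattern usable too.
print(" ".join(plain.split()))  # I something
```

Note that the second pattern requires at least one whitespace character before the dotted word, so a dotted word at the very start of the string would not be removed by it.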

Answer 1 (score: 1)

If you explicitly want to use a regex, you can use the following.

import re

string = "I really..like something like....that"
# {2,} requires two or more periods, so words with a single '.' are kept
with_dots = re.findall(r'\w+[.]{2,}\w+', string)

split = string.split()
without_dots = [word for word in split if word not in with_dots]

The solution provided by Rawing also works in this case.

' '.join(word for word in string.split() if '..' not in word)
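The two ideas in this answer can be combined into one helper. This is only a sketch: the function name drop_dotted_words is hypothetical, and it tests each whitespace-separated token directly instead of comparing against the findall results, which would miss tokens with trailing punctuation such as "wait...what?".

```python
import re

# Two or more consecutive periods anywhere inside a token.
TWO_OR_MORE_DOTS = re.compile(r"\.{2,}")


def drop_dotted_words(text):
    """Return text without whitespace-separated tokens that contain '..'."""
    kept = [word for word in text.split() if not TWO_OR_MORE_DOTS.search(word)]
    return " ".join(kept)


print(drop_dotted_words("I really..like something like....that"))
# I something
print(drop_dotted_words("so wait...what? ok"))
# so ok
```

Tokens with a single period, such as "a.b", are left untouched by this version.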

Answer 2 (score: 0)

You can also combine word boundaries with lookarounds:

\b(?<!\.)(\w+)\b(?!\.)

See a demo on regex101.com

Broken down, this says:

\b        # a word boundary
(?<!\.)   # followed by a negative lookbehind making sure there's no '.' behind
\w+       # 1+ word characters
\b        # another word boundary
(?!\.)    # a negative lookahead making sure there's no '.' ahead

The whole Python snippet:

import re

string = "I really..like something like....that"

rx = re.compile(r'\b(?<!\.)(\w+)\b(?!\.)')

print(rx.findall(string))
# ['I', 'something']
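One thing to be aware of: the lookaround pattern rejects any word that touches a period, including a word followed by a single sentence-ending '.'. If that is not wanted, a possible alternative (an assumption about the intent, not part of the original answer) is to match whole whitespace-delimited tokens and reject only those containing two consecutive periods:

```python
import re

text = "I really..like something like....that. Fine."

# Lookaround pattern from the answer: drops any word touching a period.
strict = re.findall(r"\b(?<!\.)(\w+)\b(?!\.)", text)

# Alternative sketch: a token starts where no non-space character precedes it,
# and is rejected only if '..' appears somewhere inside it.
tokens = re.findall(r"(?<!\S)(?!\S*\.{2})\S+", text)

print(strict)  # ['I', 'something']
print(tokens)  # ['I', 'something', 'Fine.']
```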