Question

Is it possible to search for repeated words in a string using a regular expression in Python?

For example:

string = ("Hello World hello mister rain")

re.search(r'[\w ]+[\w ]+[\w ]+[\w ]+[\w ]', string)

Is there a way to do this so that I don't have to keep repeating [\w ]+[\w ]? Can't I specify something like [\w ]*5?

Answer 1

I think this would be easier with plain Python:

from collections import Counter

string = "Hello World hello mister rain" # note: no () needed
words = string.split()

# Count words case-insensitively and report any that occur more than once
for word, count in Counter(map(str.lower, words)).items():
    if count > 1:
        print("The word '{}' is repeated {} times.".format(word, count))
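
If the input may contain punctuation, str.split() would treat "hello," and "hello" as different words. A minimal sketch of one way around that, assuming it is acceptable to tokenize with re.findall (the helper name repeated_words is made up for illustration):

```python
import re
from collections import Counter


def repeated_words(text):
    """Return {word: count} for words that occur more than once, ignoring case."""
    # \w+ extracts runs of word characters, so trailing punctuation is dropped
    words = re.findall(r"\w+", text.lower())
    return {word: count for word, count in Counter(words).items() if count > 1}


print(repeated_words("Hello World, hello mister rain"))  # {'hello': 2}
```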

Answer 2

To match the first repeated word in a string, you can use:

re.match(r'.*(\b\w+\b).*\1', "hello World hello mister rain")

\b matches a word boundary.

\1 matches the contents of the group defined with ().
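
Note that the pattern above is case-sensitive, and that \1 on its own can also match inside a longer word. A possible variant, offered as a sketch rather than a definitive answer, adds the (?i) flag and wraps the back-reference in \b so that only a whole-word repeat counts:

```python
import re

# (?i) ignores case; \b(\w+)\b captures a whole word;
# .*? skips ahead lazily; \b\1\b requires the same whole word again
pattern = re.compile(r"(?i)\b(\w+)\b.*?\b\1\b")

match = pattern.search("Hello World hello mister rain")
if match:
    print(match.group(1))  # Hello
```

Using re.search instead of re.match lets the engine try each starting position in turn, so group 1 is the earliest word that occurs again later in the string.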

Sorry, but I'm not sure whether this is what you want.
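
On the repetition part of the question: in a regular expression, [\w ]*5 means "zero or more of [\w ] followed by a literal 5". Repeating something a fixed number of times is written with a {n} quantifier instead. A small sketch using the string from the question (the second pattern is just an illustrative guess at what "five words" might look like):

```python
import re

string = "Hello World hello mister rain"

# [\w ]{5} is the same as writing [\w ] five times in a row
print(re.search(r"[\w ]{5}", string).group())  # Hello

# A group can be repeated too: one word, then four more " word" parts
print(re.search(r"\w+(?: \w+){4}", string).group())  # Hello World hello mister rain
```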
