Question

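I've been trying to find a way to use regular expressions (regex) to remove duplicate emails from a text file I have, but I haven't been able to get anything to work.

This is how the emails appear in the text file (example):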

examp@asdas.com
kork@kruu.com
gexx@moxx.com
hey@hayhay.cu
examp@asdas.com
geexx@modxx.com

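I haven't found a way to remove all of the duplicates; the only regex approach I've found removes duplicates that sit right next to each other.

Does anyone have any suggestions?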

Answer 1

How about:

Search: ([^@]+@[^@]+)(.*?)\1
Replace with: $1$2
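
To see what this search and replace does outside the editor, here is a minimal Python sketch that applies the same pattern to the sample from the question. Two things in it are assumptions rather than part of the answer: `re.DOTALL` is used so that `(.*?)` can cross line breaks (by default `.` does not match `\n`; in Notepad++ the equivalent would presumably be the ". matches newline" checkbox), and the helper name `remove_repeats_once` is made up for illustration.

```python
import re

# The answer's pattern. re.DOTALL is an assumption standing in for a
# ". matches newline" setting, so that (.*?) can span the lines in between.
PATTERN = re.compile(r"([^@]+@[^@]+)(.*?)\1", re.DOTALL)


def remove_repeats_once(text):
    """Run one replace-all pass: keep group 1 and the gap, drop the repeat."""
    return PATTERN.sub(r"\1\2", text)


sample = (
    "examp@asdas.com\n"
    "kork@kruu.com\n"
    "gexx@moxx.com\n"
    "hey@hayhay.cu\n"
    "examp@asdas.com\n"
    "geexx@modxx.com\n"
)

result = remove_repeats_once(sample)
print(result)
```

On this sample a single pass drops the second `examp@asdas.com` line and leaves the other five lines in their original order. A file with several different duplicates may need more than one pass, because each match consumes everything between the first occurrence and its repeat.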

Regex explanation:

The regular expression:

(?-imsx:([^@]+@[^@]+)(.*?)\1)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^@]+                    any character except: '@' (1 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
    @                        '@'
----------------------------------------------------------------------
    [^@]+                    any character except: '@' (1 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  \1                       what was matched by capture \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
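
One thing the breakdown makes visible is that `[^@]+` excludes only '@', so group 1 is not tied to a whole line and `\1` can also match a fragment inside a different address. The Python sketch below compares the pattern as given with a line-anchored variant. The variant is an assumption of this sketch, not the answerer's pattern, and the names `NAIVE`, `ANCHORED` and `dedupe` are illustrative only. It has been reasoned through for Python's `re` module; Notepad++'s engine is not tested here.

```python
import re

# Pattern from the answer, with DOTALL so the gap can span lines.
NAIVE = re.compile(r"([^@]+@[^@]+)(.*?)\1", re.DOTALL)

# Hypothetical variant: group 1 must be one complete line, and the repeat
# must also be a complete line (its trailing newline is consumed too).
ANCHORED = re.compile(
    r"^([^@\n]+@[^@\n]+)$(.*?)^\1$\n?",
    re.MULTILINE | re.DOTALL,
)


def dedupe(pattern, text):
    """Repeat the replace-all pass until a pass changes nothing."""
    while True:
        text, count = pattern.subn(r"\1\2", text)
        if count == 0:
            return text


sample = (
    "examp@asdas.com\n"
    "kork@kruu.com\n"
    "gexx@moxx.com\n"
    "hey@hayhay.cu\n"
    "examp@asdas.com\n"
    "geexx@modxx.com\n"
)

naive_result = dedupe(NAIVE, sample)
anchored_result = dedupe(ANCHORED, sample)
print(anchored_result)
```

With the anchored variant, repeated passes leave each address exactly once and keep `geexx@modxx.com` intact. With the unanchored pattern, a second pass can treat the shared fragment `exx@mo` in `gexx@moxx.com` and `geexx@modxx.com` as a duplicate and cut it out of the later line, so it is worth checking the result (or working on a copy of the file) before running the replacement repeatedly.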

Notepad++ regex to remove duplicate emails