Regex to match everything except two names and an <email address> after a specific word

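Time: 2012-05-02 21:36:05

Tags: regex expression match names negative-lookahead

I have a bunch of names and email addresses in these aggregated emails, and I want to strip everything out of the whole document except First Last <email@domain.com>. Basically I have...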

From: Name Wood <email@gmail.com>
Subject: Yelp entries for iPod contest
Date: April 20, 2012 12:51:07 PM EDT
To: email@domain.cc

Have had a great experience with .... My Son ... is currently almost a year into treatment. Dr. ... is great! Very informative and always updates us on progress and we have our regular visits. The ... buck program is a great incentive which they've implemented to help kids take care of their teeth/braces. They also offer payment programs which help for those of us that need a structured payment option. Wouldn't take my kids anywhere else. Thanks Dr. ... and staff
Text for 1, 2, and 3 entries to Yelp
Hope ... wins!!
Begin forwarded message:

From: Name Wood <email@gmail.com>
Subject: reviews 2 and 3
Date: April 20, 2012 12:44:26 PM EDT
To: email@domain.cc

Have had a great experience with ... Orthodontics. My Son ... is currently almost a year into treatment. Dr. ... is great! Very informative and always updates us on progress and we have our regular visits. The ... buck program is a great incentive which they've implemented to help kids take care of their teeth/braces. They also offer payment programs which help for those of us that need a structured payment option. Wouldn't take my kids anywhere else. Thanks Dr. ... and staff
Have had a great experience with...

I want to match only...

Name Wood <email@gmail.com>
Name Wood <email@gmail.com>

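from this text. So basically I want to match the two words that follow "From: ", plus "<" + email address + ">", without including "From: " itself. From my research I gather that this calls for a negative lookahead (I think), a search for two whole words (somehow using {0,2}), and then an email address running from a < character to a > character.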

3 answers:

Answer 0: (score: 0)

You can do it like this:

/(?:From: )(.*)/g
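
As a rough illustration of how this pattern behaves, here is a minimal Python sketch. The `sample` string is a hypothetical, shortened stand-in for the emails in the question, and `re.findall` plays the role of the `/g` flag. Because `.` does not cross newlines by default, the capture group holds the rest of each "From: " line.

```python
import re

# Hypothetical, shortened stand-in for the forwarded emails in the question.
sample = (
    "From: Name Wood <email@gmail.com>\n"
    "Subject: Yelp entries for iPod contest\n"
    "To: email@domain.cc\n"
    "\n"
    "Begin forwarded message:\n"
    "\n"
    "From: Name Wood <email@gmail.com>\n"
    "Subject: reviews 2 and 3\n"
)

# Without re.DOTALL, "." stops at a newline, so group 1 is the rest of the line.
pattern = re.compile(r"(?:From: )(.*)")

# With a single capture group, findall returns the group contents.
senders = pattern.findall(sample)
print(senders)
```

Note that this captures whatever follows "From: " on a line, so a body line that happens to contain "From: " would be picked up as well.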

Answer 1: (score: 0)

This regex will find what you are looking for:

(?<=From:)\s*[^<]+<[^>]+>

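However, it is a little unclear what problem you are trying to solve. The matched text should probably go into one or more groups so that you can extract the parts you need. (The name in one group and the email in another? Or both together?) You haven't said what you want to do with the result, so you will need to provide more information. The above is the simplest case.

Explanation: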

(?<=From:)   # positive lookbehind to find "From:"
\s*          # optional whitespace
[^<]+<       # everything up to the first '<' (the name)
[^>]+>       # everything up to the '>' (the email)
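
The sketch below tries this pattern in Python, assuming a hypothetical `sample` string modeled on the question. Since `\s*` sits after the lookbehind, each match starts with the space that follows "From:", so the example strips it. The second pattern, `grouped`, is not from the answer: it is one possible way to put the name and the email into separate named groups, as suggested above.

```python
import re

# Hypothetical, shortened stand-in for the forwarded emails in the question.
sample = (
    "From: Name Wood <email@gmail.com>\n"
    "Subject: Yelp entries for iPod contest\n"
    "To: email@domain.cc\n"
    "\n"
    "Begin forwarded message:\n"
    "\n"
    "From: Name Wood <email@gmail.com>\n"
    "Subject: reviews 2 and 3\n"
)

# The pattern from the answer; the lookbehind keeps "From:" out of the match.
lookbehind = re.compile(r"(?<=From:)\s*[^<]+<[^>]+>")
matches = [m.group(0).strip() for m in lookbehind.finditer(sample)]

# Assumed variant (not in the answer): name and email in separate groups.
grouped = re.compile(r"From:\s*(?P<name>[^<]+?)\s*<(?P<email>[^>]+)>")
pairs = [(m.group("name"), m.group("email")) for m in grouped.finditer(sample)]

print(matches)
print(pairs)
```

Both patterns use negated character classes, which can cross line breaks, so a "From:" line without angle brackets would let the match run on into the following lines.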

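Answer 2: (score: 0)

If you want to remove everything except the names and emails, use the dot-all modifier (so that the dot also matches newlines) and do a global regex find and replace, with $1\n as the replacement.

This one is faster, but it leaves extra newlines behind where the leftover text was.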

Find .*?From:[^\S\n]*([^<\n]+<[^>\n]*\@[^>\n]*>)|.*$

This one is slower (it uses a lookahead) but does not leave extra newlines.

Find  .*?From:[^\S\n]*([^<\n]+<[^>\n]*\@[^>\n]*>)(?:(?!From:[^\S\n]*[^<\n]+<[^>\n]*\@[^>\n]*>).)*
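
A minimal sketch of both find-and-replace variants in Python follows, under a few assumptions: the `sample` string is hypothetical, `re.DOTALL` stands in for the dot-all modifier, and the replacement `$1\n` is written as `\1\n` because that is Python's backreference syntax. It also relies on Python 3.5 or later, where a group that did not take part in the match is replaced by an empty string.

```python
import re

# Hypothetical, shortened stand-in for the forwarded emails in the question.
sample = (
    "From: Name Wood <email@gmail.com>\n"
    "Subject: Yelp entries for iPod contest\n"
    "To: email@domain.cc\n"
    "\n"
    "Hope ... wins!!\n"
    "Begin forwarded message:\n"
    "\n"
    "From: Name Wood <email@gmail.com>\n"
    "Subject: reviews 2 and 3\n"
    "\n"
    "Have had a great experience with...\n"
)

# Shared piece: a "From:" header followed by "Name <user@host>" on one line.
header = r"From:[^\S\n]*([^<\n]+<[^>\n]*\@[^>\n]*>)"
header_nogroup = r"From:[^\S\n]*[^<\n]+<[^>\n]*\@[^>\n]*>"

# Faster variant: the second alternative swallows the leftover tail,
# which is then replaced by a bare newline.
fast = re.compile(r".*?" + header + r"|.*$", re.DOTALL)

# Slower variant: a tempered dot consumes text up to the next header.
slow = re.compile(r".*?" + header + r"(?:(?!" + header_nogroup + r").)*", re.DOTALL)

cleaned_fast = fast.sub(r"\1\n", sample)
cleaned_slow = slow.sub(r"\1\n", sample)

print(repr(cleaned_fast))  # names plus trailing blank lines
print(repr(cleaned_slow))  # names only, one per line
```

The exact number of trailing newlines left by the faster variant can depend on how the regex engine treats the empty match at the end of the input, so it is safer to drop blank lines afterwards than to rely on a fixed count.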