JavaScript and PHP regular expressions to capture the email addresses in the relevant recipient field

Date: 2014-08-31 04:01:05

Tags: javascript php regex node.js email

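I am trying to develop two regular expressions, one in JavaScript and one in PHP, that capture the email addresses found in a raw email that belong only to their respective field (for example: ) and only to that field (that is, no other email addresses from elsewhere in the corpus), but I have not succeeded.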

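Here are the ideal requirements:

  • The match must start on a new line, at the very beginning of that line.

  • The line must begin with "To" (without the double quotes, case-insensitive, optionally followed by a single colon, then optionally by one or any number of whitespace characters).

  • After that, every email address must be captured individually, up to the last email address but before any non-email-address word (examples of non-email-address words, though not an exhaustive list: Subject:, From:, CC:, Hello, and so on).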

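I have succeeded with requirements #1 and #2 but am struggling with #3. I have been forced to settle for solving only #1 and #2 and then splitting/exploding the result on commas, which works, but I know it can be done better.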
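The workaround described above could look roughly like the following in JavaScript. This is a minimal sketch, not the asker's actual code: the function name `splitToField` and the simplified header pattern are assumptions made for illustration, and the pattern treats lines that begin with a space or tab as continuations of the To field, as in the sample email.

```javascript
// Hypothetical helper illustrating the "capture the blob, then split on commas" workaround.
function splitToField(rawEmail) {
  // m flag: ^ matches at the start of any line; i flag: "To" is case-insensitive.
  // The colon is optional, as requirement #2 allows, so a body line such as
  // "Tomorrow ..." would also match; only the first match is used here.
  // The capture group takes the rest of the line plus any continuation
  // lines that begin with a space or tab.
  const toField = /^To:?[ \t]*(.*(?:\r?\n[ \t]+.*)*)/mi;
  const match = toField.exec(rawEmail);
  if (match === null) return [];
  return match[1]
    .split(',')
    .map((part) => part.trim())          // drop padding and line breaks
    .filter((part) => part.length > 0);  // drop empty pieces after a trailing comma
}

const sample = [
  'From: markskilling@hotmail.com',
  'To: majalinda@hotmail.com, ksbiehl@hotmail.com, ',
  '    cjones@cityofnapa.org',
  'Subject: ',
  'X-To: majalinda@hotmail.com, Jeff Skilling',
].join('\n');

console.log(splitToField(sample));
```

On this sample the function returns the three addresses from the To field and ignores the X-To line. Splitting on commas does not check that each piece is an email address, which is the gap requirement #3 is meant to close.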

Here is a sample email from the public Enron email dataset:

Message-ID: <3470405.1075840065684.JavaMail.evans@thyme>
Date: Sun, 14 Feb 1999 01:33:00 -0800 (PST)
From: markskilling@hotmail.com
To: majalinda@hotmail.com, ksbiehl@hotmail.com, dlmackler@worldnet.att.net, 
    cjones@cityofnapa.org, hazerfen@hotmail.com, meyerjames@usa.net, 
    tomskilljr@aol.com, c.combs@intershop.com, mshachat@aol.com, 
    clowes@email.msn.com, clowes@cmithlaw.com, transwd@aol.com, 
    smackarnes@aol.com, samjstokes@aol.com, joguti@aol.com, 
    bjmackaysmith@hotmail.com, m_larnold@sprynet.com, dwood@rwblaw.com, 
    daveroche@aol.com, milobenn@sirius.com, pwc1@aol.com, 
    candc@ix.netcom.com, eisenbachrl@cooley.com, mwf15@columbia.edu, 
    khuber@hcmwealth.com, doyna@coffeenet.com, katekross@aol.com, 
    mark.langermann@issna.com, martin@sbu.edu, deniz.razon@abbott.com, 
    sras@lycosmail.com, jeff.skilling@enron.com, tskilling@tribune.com, 
    audryn@mindspring.com, mmmmisha@ix.netcom.com, ermak@gte.net
Subject: 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-From: "Mark Skilling" <markskilling@hotmail.com>
X-To: majalinda@hotmail.com, ksbiehl@hotmail.com, dlmackler@worldnet.att.net, cjones@cityofnapa.org, hazerfen@hotmail.com, meyerjames@usa.net, tomskilljr@aol.com, c.combs@intershop.com, mshachat@aol.com, clowes@email.msn.com, clowes@cmithlaw.com, transwd@aol.com, smackarnes@aol.com, samjstokes@aol.com, joguti@aol.com, bjmackaysmith@hotmail.com, m_larnold@sprynet.com, dwood@rwblaw.com, daveroche@aol.com, milobenn@sirius.com, pwc1@aol.com, candc@ix.netcom.com, eisenbachrl@cooley.com, mwf15@columbia.edu, khuber@hcmwealth.com, doyna@coffeenet.com, katekross@aol.com, mark.langermann@issna.com, martin@sbu.edu, deniz.razon@abbott.com, sras@lycosmail.com, Jeff Skilling, tskilling@tribune.com, audryn@mindspring.com, mmmmisha@ix.netcom.com, ermak@GTE.net
X-cc: 
X-bcc: 
X-Folder: \Jeffrey_Skilling_Dec2000\Notes Folders\All documents
X-Origin: SKILLING-J
X-FileName: jskillin.nsf

February 10, 1999

I am wakened by the approaching chatter of the early morning call to
prayer (sounding a bit like the fuss made by one of those cartoon balls
of fighting dogs and cats).  From the minarets of far away mosques, the
muezzins' cries ricochet through Istanbul's still dark alleys and
streets.  Seagulls, who have drifted up the hill from the Golden Horn,
squawk contentedly outside my window.  From somewhere down below, a
miserable dog joins into the pre-dawn ruckus, soon followed by the local
muezzin, whose amplified singing drowns out all the rest.  He reminds us
that God is great and that prayer is a whole lot more important than
sleep (at least that's what I've been told; he sings in Arabic).
Because my religion thinks more highly of sleep, I feel free to simply
listen, while gently trying to pull the warm blanket of sleep back over
me.  The  muezzin has a beautiful voice.  Its rise and fall stitches
itself into the edges of a dream (in which a former best friend and I
argue about the rules of a game of miniature golf) hanging just out of
reach.

Slowly, the banal calculations that fill my days begin to crowd their
way into my head.  It's about a quarter to six, I figure, which means
there's time for a bit of writing, or even Turkish vocabulary, before I
douse myself in the shower to full consciousness.  I remind myself of
the theory that one can write most freely while still intoxicated with
sleep (or just plain intoxicated), am immediately stricken with the fear
I am incapable of such freedom, take a look round my brain for something
worth writing about (find nothing), hypothesize about the advantages of
a quick dash into the hallway to turn on the gas heater (so that when I
really get up it will be reasonably warm out there), wonder if I really
do have enough stuff prepared to fill up the two hours of my English
lesson with Suleyman, conclude that all this thinking has probably made
any more sleep impossible, then (I realize later) fall back to sleep.

                               *     *     *

My new phone [(212) 292-6486] is hooked up and I have a new internet
server, which will make it much easier to keep in touch.  Hope to attack
that backlog of responses that are due.

Keep in touch.

Mark-O

______________________________________________________
Get Your Private, Free Email at http://www.hotmail.com

Thanks for your help. Hopefully this benefits anyone else searching! :)

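Below is the regular expression I am currently using; it satisfies requirements #1 and #2 and returns the recipient blob for the specific field: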

/^(?:To:?(?:\s+)?)((?:(?:(?:(?:[^<>()[\]\\.,;:\s@\"]+(?:\.[^<>()[\]\\.,;:\s@\"]+)*)|(?:\".+\"))@(?:(?:\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(?:(?:[a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))),?\s+)+)+/mi;

1 Answer:

Answer 0: (score: 0)

Once you have solved #1 and #2, you can use this to pull every email address out of your selection:

[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})

This should catch well-formed email addresses of the usual kind and won't catch invalid ones, e.g.: example@gmail...com
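As a rough sketch of how the two steps might be combined in JavaScript: first isolate the To field, then run the pattern above over that text only. The function name `extractToAddresses` and the simplified header pattern are assumptions for illustration. The `g` and `i` flags are also additions, since the pattern as written lists lowercase letters only, while the sample contains `ermak@GTE.net` in its X-To line.

```javascript
// Hypothetical helper: isolate the To field, then extract addresses from it.
function extractToAddresses(rawEmail) {
  // Step 1 (requirements #1 and #2): a line starting with "To", optional colon,
  // plus continuation lines that begin with a space or tab.
  const toField = /^To:?[ \t]*(.*(?:\r?\n[ \t]+.*)*)/mi;
  const header = toField.exec(rawEmail);
  if (header === null) return [];

  // Step 2 (requirement #3): the answer's pattern, with g to find every
  // occurrence and i because the character classes are lowercase-only.
  const address =
    /[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/gi;

  // With the g flag, match() returns the full matches, or null if there are none.
  return header[1].match(address) || [];
}

const raw = [
  'To: majalinda@hotmail.com, dlmackler@worldnet.att.net, ',
  '    ermak@GTE.net',
  'Subject: Hello',
  '',
  'Contact me at other@example.com',
].join('\n');

console.log(extractToAddresses(raw));
```

Because only the captured field is searched, the address in the message body is not returned. One limitation to keep in mind: the final `{2,4}` caps the last label at four letters, so an address under a longer top-level domain would be matched only in truncated form.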

Link: Using a regular expression to validate an email address