Regular expression to extract relevant information from an email

Time: 2014-10-14 09:56:22

Tags: regex node.js mechanicalturk

Given the following text:

Greetings from Amazon Mechanical Turk,

You are receiving this email because you subscribed to be notified when
certain events related to your HITs or Qualifications occurred.

Specific event information is shown below:

    Event Type: HITExpired
    Event Time: 2014-10-14T08:00:05Z
    HIT Type ID: 3UY3BQX0VV1BL434D90TUMKT09C20F
    HIT ID: 37VHPF5VYC3MCKRQBEXYFY64LO48CJ


Sincerely,
Amazon Mechanical Turk
https://requestersandbox.mturk.com
410 Terry Avenue North
SEATTLE, WA 98109-5210 USA

How can I extract the event information from it into the following object?

{
    'Event Type': 'HITExpired',
    'Event Time': '2014-10-14T08:00:05Z',
    'HIT Type ID': '3UY3BQX0VV1BL434D90TUMKT09C20F',
    'HIT ID': '37VHPF5VYC3MCKRQBEXYFY64LO48CJ'
}

1 answer:

Answer 0: (score: 2)

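var a = {};
str.replace(/^[ \t]+([^:\n]+):[ \t]*(.*)$/mg, function (_, k, v) { a[k] = v; });

After this, a will be the structure you want, where str holds the email text. replace calls the function once for each match. The pattern matches a line that starts with one or more spaces or tabs, followed by a key, a colon, optional spaces or tabs, and then the value (the rest of the line); the function simply stores each key and value in the object a. The indentation is matched with [ \t] rather than \s because \s also matches newlines, which would let a match begin on the blank line before "Specific event information is shown below:" and treat that line as a key.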
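As a self-contained illustration of this approach, here is a minimal sketch that can be run directly in Node.js. The names email and parseIndentedFields are hypothetical, the sample text is a shortened copy of the email from the question, and the sketch assumes a Node version that supports String.prototype.matchAll (12 or later). It iterates over the matches instead of calling replace for its side effects, and it matches indentation with [ \t] so that a match cannot start on a blank line.

```javascript
// Sample input: a shortened copy of the notification email from the question.
const email = [
  'Greetings from Amazon Mechanical Turk,',
  '',
  'Specific event information is shown below:',
  '',
  '    Event Type: HITExpired',
  '    Event Time: 2014-10-14T08:00:05Z',
  '    HIT Type ID: 3UY3BQX0VV1BL434D90TUMKT09C20F',
  '    HIT ID: 37VHPF5VYC3MCKRQBEXYFY64LO48CJ',
  '',
  'Sincerely,',
  'Amazon Mechanical Turk',
  'https://requestersandbox.mturk.com',
].join('\n');

// Hypothetical helper: collect every indented "Key: value" line into an object.
function parseIndentedFields(text) {
  const fields = {};
  // [ \t]+ requires indentation, [^:\n]+ is the key (up to the first colon),
  // and (.*) is the value, i.e. the rest of the line (it may contain colons).
  const pattern = /^[ \t]+([^:\n]+):[ \t]*(.*)$/mg;
  for (const [, key, value] of text.matchAll(pattern)) {
    fields[key] = value;
  }
  return fields;
}

const result = parseIndentedFields(email);
console.log(result);
```

Because the key stops at the first colon on the line, the colons inside the Event Time value end up in the value rather than splitting it.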

Edit: fixed a silly mistake. Also, I am assuming that only the desired lines are indented.
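If the indentation assumption does not hold for your emails, one possible alternative is to accept only known field names. The following is a sketch under that assumption: the list KNOWN_KEYS is taken from the sample email, the name extractKnownFields is hypothetical, and it uses plain string splitting rather than a regular expression on each line.

```javascript
// Field names taken from the sample email (assumption); extend as needed.
const KNOWN_KEYS = ['Event Type', 'Event Time', 'HIT Type ID', 'HIT ID'];

// Hypothetical helper: extract only whitelisted keys, whether indented or not.
function extractKnownFields(text) {
  const fields = {};
  for (const line of text.split(/\r?\n/)) {
    const idx = line.indexOf(':');          // first colon separates key and value
    if (idx === -1) continue;               // no colon: not a "Key: value" line
    const key = line.slice(0, idx).trim();
    if (KNOWN_KEYS.includes(key)) {
      fields[key] = line.slice(idx + 1).trim();
    }
  }
  return fields;
}

const sample = [
  'Event Type: HITExpired',
  '  HIT ID: 37VHPF5VYC3MCKRQBEXYFY64LO48CJ',
  'https://requestersandbox.mturk.com',
].join('\n');
console.log(extractKnownFields(sample));
```

The URL line contains a colon, but its prefix ("https") is not in the list, so it is skipped regardless of indentation.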