Remove everything except the text written by the user

Asked: 2017-04-04 09:44:28

Tags: node.js regex outlook-restapi

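I am using the Outlook API to fetch the bodies of sent emails. Now I want to clean up each body by removing all links, headers, etc., and keep only the text the user wrote. Here is my regex function: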

function getRegex() {

    var regex1 = /^(?=.*Forwarded message)[^]*/m;
    var regex2 = /^(?=.*From: )[^]*/m;
    var regex3 = /^(?=.*On )[^]*/m;
    var regex4 = /^(?=.*http)[^]*/m;

    return new RegExp("(" + regex1.source + ")|(" + regex2.source + ")|(" + regex3.source + ")|(" + regex4.source + ")");
}

And here is the function that fetches the sent emails from Outlook:

outlook.mail.getMessages({
    token: token.token.access_token,
    odataParams: queryParams,
    folderId: 'SentItems'

}, function (err, result) {

    if (err){
        console.log(err);
        return;
    }

    var mail_array = result.value;
    var outlook_sent_emails = '';

    mail_array.forEach(function (mail) {

        if (mail.BodyPreview !== '') {
            outlook_sent_emails += (mail.BodyPreview + " ");
        }
    });

    console.log(outlook_sent_emails.replace(getRegex(), ""));  //This is not working
});

The line console.log(outlook_sent_emails.replace(getRegex(), "")); shows that I still get all the links, headers, etc.

The same regular expression works elsewhere in my code.

Edit:

Sample text:

  From: <Name>
    Sent: <Datetime>
    To: <Name>
    Subj Dear Sir/Madam


Hi Vaibhav,

Hope you are doing well.

http://developer.android.com/sdk/index.html

Sent from my Windows 10 phone

I want to remove all kinds of links from the string, as well as text like the following:

From: <Name>
Sent: <Datetime>
To: <Name>
Subj Dear Sir/Madam

Expected output:

 Hi Vaibhav,

 Hope you are doing well.

1 answer:

Answer 0 (score: 2)

  

Update: added http

You can try this:

^.*(From:|Sent:|Sent\s+From|To:|Subj|Dear\s+(Sir|Madam)|http).*$

and replace each match with an empty string ("").
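The pattern only behaves line by line when it is compiled with the right flags. One thing worth checking in the question's getRegex() is that RegExp.prototype.source returns the pattern text without its flags, so a regex rebuilt from .source loses m (and g, i) unless they are passed again. The sketch below illustrates this; the sample string and the variable names are made up for the illustration.

```javascript
// The pattern from this answer, as a plain source string (no flags yet).
const source = String.raw`^.*(From:|Sent:|Sent\s+From|To:|Subj|Dear\s+(Sir|Madam)|http).*$`;

// .source never carries flags: the "m" below is lost when rebuilding.
const original = /^x/m;
const rebuilt = new RegExp(original.source);

// Same pattern compiled without and with flags.
const noFlags = new RegExp(source);
const withFlags = new RegExp(source, "gmi");

// Hypothetical sample input, not taken from a real mailbox.
const sample = "Hi Vaibhav,\nFrom: <Name>\nhttp://example.com\nBye";

// Without "m", ^ and $ only anchor at the start and end of the whole string,
// and "." cannot cross a newline, so nothing matches here.
const withoutFlagsResult = sample.replace(noFlags, "");

// With "gmi", every line containing a keyword is emptied.
const withFlagsResult = sample.replace(withFlags, "");

console.log(rebuilt.multiline);
console.log(JSON.stringify(withoutFlagsResult));
console.log(JSON.stringify(withFlagsResult));
```

If the combined regex in the question is kept, passing the flags explicitly as the second argument of new RegExp(...) is probably the first thing to try.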

Demo

const regex = /^.*(From:|Sent:|Sent\s+From|To:|Subj|Dear\s+(Sir|Madam)|http).*$/gmi;
const str = `  From: <Name>
    Sent: <Datetime>
    To: <Name>
    Subj Dear Sir/Madam


Hi Vaibhav,

Hope you are doing well.

http://developer.android.com/sdk/index.html

Sent from my Windows 10 phone`;
const subst = ``;
const result = str.replace(regex, subst).trim();
console.log(result);
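A minimal sketch of how the same pattern might be wrapped for the question's use case, assuming each message object exposes a BodyPreview string as in the question. The names HEADER_OR_LINK, cleanBody and cleanBodies are hypothetical helpers, not part of the Outlook API. Cleaning each message separately, instead of one concatenated string, keeps the line boundaries the pattern depends on.

```javascript
// Same pattern as in the demo above: matches whole lines that contain
// a header keyword or "http" (g = all matches, m = per line, i = any case).
const HEADER_OR_LINK = /^.*(From:|Sent:|Sent\s+From|To:|Subj|Dear\s+(Sir|Madam)|http).*$/gmi;

// Hypothetical helper: strip matching lines, then tidy the leftover blank lines.
function cleanBody(text) {
  return text
    .replace(HEADER_OR_LINK, "")       // empty every header/link line
    .replace(/(\r?\n){3,}/g, "\n\n")   // collapse runs of blank lines
    .trim();
}

// Hypothetical helper: clean each message on its own and drop empty results.
// Assumes objects shaped like { BodyPreview: "..." }, as in the question.
function cleanBodies(mails) {
  return mails
    .map((mail) => cleanBody(mail.BodyPreview || ""))
    .filter((text) => text !== "");
}

// Usage with made-up messages.
const mails = [
  { BodyPreview: "From: <Name>\nTo: <Name>\n\nHi Vaibhav,\n\nHope you are doing well.\n\nhttp://developer.android.com/sdk/index.html" },
  { BodyPreview: "" },
];
console.log(cleanBodies(mails));
```

Note that this approach drops any line containing one of the keywords, so a sentence the user wrote that happens to include "http" or "To:" would be removed as well.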