Regex to capture a sentence between two strings

Time: 2017-10-04 13:22:02

Tags: regex email

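I need help writing a regular expression that captures the string between two words. Below is a Proofpoint log from which I need to capture the entire subject. Subjects can contain spaces and special characters, and may or may not be wrapped in double quotes.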

I am trying to extract the text between the two words subject= and spamscore.
  

Oct 4 05:56:32 m0001280 filter_instance1[29132]: rprt s=2dcdnyveuh m=1 x=2dcdnyveuh-1 mod=mail cmd=msg module=pdr rule=pass action=continue attachments=0 rcpts=1 routes=default_inbound size=79291 guid=LkvQgKjeRdwsasaddt hdr_mid=<0.1.C1.766.adsdDC.0@omp.hello.com> qid=v94Au9Vj022820 hops-ip=X1.1X.11.X subject=hello world spamscore virusname= duration=0.661 elapsed=1.049

For this case we already have a regular expression that captures the subject:

(?P<email_subject>(?<=subject=)(.*)(?=spam))

But this is tricky, because in some log formats the word spamscore is absent. Here is another sample log:

  

Oct 4 05:56:32 m0001280 filter_instance1[29132]: rprt s=2dcdnyveuh m=1 x=2dcdnyveuh-1 mod=mail cmd=msg module=pdr rule=pass action=continue attachments=0 rcpts=1 routes=default_inbound size=79291 guid=LkvQgKjeRdwsasaddt hdr_mid=<0.1.C1.766.adsdDC.0@omp.hello.com> qid=v94Au9Vj022820 hops-ip=X1.1X.11.X subject="hello!@@@#@ 42 43(saadxasD)" virusname= duration=0.661 elapsed=1.049
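To make the problem concrete, here is a minimal Python sketch that runs the pattern above against the tail of both sample logs. The shortened log strings and the helper name extract are only illustrative, and the sketch assumes the raw logs use key=value pairs with no spaces around the equals sign, as the subject= lookbehind implies.

```python
import re

# Pattern from the question; Python's re module accepts the (?P<name>...) syntax.
PATTERN = re.compile(r'(?P<email_subject>(?<=subject=)(.*)(?=spam))')

# Shortened stand-ins for the two sample logs (only the tail that matters here).
with_spamscore = (
    'hops-ip=X1.1X.11.X subject=hello world spamscore '
    'virusname= duration=0.661 elapsed=1.049'
)
without_spamscore = (
    'hops-ip=X1.1X.11.X subject="hello!@@@#@ 42 43(saadxasD)" '
    'virusname= duration=0.661 elapsed=1.049'
)


def extract(line):
    """Return the captured subject, or None when the pattern does not match."""
    match = PATTERN.search(line)
    return match.group('email_subject') if match else None


print(repr(extract(with_spamscore)))     # 'hello world ' (note the trailing space)
print(repr(extract(without_spamscore)))  # None, because "spam" never appears
```

The second call returns None because the lookahead (?=spam) can never succeed when the log line contains no spamscore token.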

What is the best way to capture the subject without really relying on spamscore, virusname, or the double quotes as delimiters in the regular expression?

1 Answer:

Answer 0: (score: 0)

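You need to rely on something as a delimiter. For example:

(?<=subject=)"?\K(.*?)(?=spam|virus|")

Note that \K, which discards the text matched so far from the reported match, is supported by PCRE and Perl but not by every regex engine.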
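Python's built-in re module does not support \K, so the pattern cannot be used there as written. The following is a rough adaptation, not the answer's exact pattern: it consumes the optional opening quote outside a named group and trims surrounding whitespace. The names SUBJECT_RE and extract_subject and the shortened sample lines are hypothetical, and the sketch again assumes key=value pairs with no spaces around the equals sign.

```python
import re

# Adaptation of the answer's pattern for Python's re module: there is no \K,
# so the optional opening quote is matched outside the capturing group.
SUBJECT_RE = re.compile(r'subject="?(?P<subject>.*?)(?=spam|virus|")')


def extract_subject(line):
    """Return the subject text, or None if no delimiter follows subject=."""
    match = SUBJECT_RE.search(line)
    return match.group('subject').strip() if match else None


# Shortened stand-ins for the two sample logs from the question.
samples = [
    'hops-ip=X1.1X.11.X subject=hello world spamscore virusname= duration=0.661',
    'hops-ip=X1.1X.11.X subject="hello!@@@#@ 42 43(saadxasD)" virusname= duration=0.661',
]

for line in samples:
    print(repr(extract_subject(line)))
# 'hello world'
# 'hello!@@@#@ 42 43(saadxasD)'
```

Because the match is lazy and stops at the first occurrence of spam, virus, or a double quote, a subject that itself contains one of those strings is cut short. For instance, "anti-spam policy" would be captured as "anti-". This is the price of relying on those tokens as delimiters.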