Extract part of a string in Python if it contains a word from a list of elements

Time: 2021-05-13 19:25:01

Tags: python-3.x list

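If any element of msgs matches the content of a_string, I want to extract the part of a_string that follows it (abc@xyz.com). Currently I can only search for one element, i.e. Email address:, by hard-coding the search text in the code below, but I want to compare against multiple elements from the msgs list. Can anyone help?

Note that the input is multi-line text from an Outlook mail, and the label can be either Email addresses: or Email address: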

Mail delivery was not successful.
Delivery/update and destination information:
Run at: 5/12/2021 5:00:21 PM
Recipient name: John Doe
Email addresses: abc@xyz.com
Error message: Skipping content as no data returned.



  import re

  a_string = 'Error message: abc@xyz.com'
  msgs = ['Email addresses: ', 'Email address: ']

  # Hard-coded search for a single prefix
  matches = re.finditer(r"Email address:\s(.*)$", a_string, re.MULTILINE)
  for match in matches:
      emailaddress = match.group(1)
      print(emailaddress)

3 Answers:

Answer 0: (Score: 0)

You can use findall

import re
a_string = 'Error message: abc@xyz.com'
msgs = ['Email addresses: ', 'Email address: ', 'Error message: ']

# (?:...) groups the alternatives so (.*) captures the text after any of them
re.findall(fr"(?:{'|'.join(msgs)})(.*)", a_string)
# ['abc@xyz.com']
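
As a variation on the findall approach, here is a minimal sketch that wraps it in a helper. The function name extract_after_prefixes and the mail_body sample are hypothetical, chosen for illustration. It assumes the prefixes are literal text (so they are passed through re.escape) and that the wanted value is the first whitespace-free token after the prefix.

```python
import re

def extract_after_prefixes(text, prefixes):
    """Return the token following any of the given literal prefixes."""
    # re.escape guards against prefixes that contain regex metacharacters.
    # Longer prefixes go first so a shorter one cannot shadow a longer one.
    ordered = sorted(prefixes, key=len, reverse=True)
    alternation = "|".join(re.escape(p) for p in ordered)
    # (?:...) groups the alternatives so the capture applies to all of them.
    pattern = re.compile(rf"(?:{alternation})\s*(\S+)")
    return pattern.findall(text)

# Hypothetical sample based on the mail text in the question
mail_body = """Mail delivery was not successful.
Recipient name: John Doe
Email addresses: abc@xyz.com
Error message: Skipping content as no data returned."""

print(extract_after_prefixes(mail_body, ["Email addresses:", "Email address:"]))
# ['abc@xyz.com']
```

Without the grouping, the capture group would belong only to the last alternative, and findall would return empty strings for matches of the other prefixes.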

Answer 1: (Score: 0)

You can try:

>>> import re
>>>
>>> a_string = """Mail delivery was not successful.
... Delivery/update and destination information:
... Run at: 5/12/2021 5:00:21 PM
... Recipient name: John Doe
... Email addresses: abc@xyz.com
... Error message: Skipping content as no data returned."""
>>> msgs = ['Email address:', 'Email addresses:']
>>> 
>>> p = rf"(?:{'|'.join(msgs)})\s(\S+)" 
>>> pat = re.compile(p)
>>> # finditer on a compiled pattern takes (string, pos, endpos); flags belong in re.compile
>>> matches = pat.finditer(a_string)
>>> 
>>> for match in matches:
...    print(match.group(1))
... 
abc@xyz.com

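Note: ?: makes the first group non-capturing, so you can use match.group(1) to get the email address. If you need to know which element of msgs matched, remove the ?:. The email address will then be in match.group(2), and match.group(1) will be the element from msgs.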
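
The capturing variant described in the note can be sketched as follows. The shortened a_string and the results list are assumptions made for illustration; the pattern is the one above with ?: removed.

```python
import re

# Shortened, hypothetical sample of the mail text
a_string = "Recipient name: John Doe\nEmail addresses: abc@xyz.com"
msgs = ["Email address:", "Email addresses:"]

# Capturing group around the alternation:
# group(1) is the matched label from msgs, group(2) is the email address.
pat = re.compile(rf"({'|'.join(msgs)})\s(\S+)")

results = [(m.group(1), m.group(2)) for m in pat.finditer(a_string)]
print(results)
# [('Email addresses:', 'abc@xyz.com')]
```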

Answer 2: (Score: 0)

Thanks to the other user who posted this answer.

import re

p = rf"(?:{'|'.join(msgs)})\s(\S+)"
pat = re.compile(p)
# Flags go to re.compile; the second argument of pat.finditer is a start position
matches = pat.finditer(a_string)
for match in matches:
    print(match.group(1))