Question

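I'm trying to use Python to filter text out of an email body. I need to get the "Needed Content" part of the message. This is the string I receive when the email arrives: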

'--001a1144b8cc8e9a67055ddfb9ec
Content-Type: text/plain; charset="UTF-8"

Needed Content

--001a1144b8cc8e9a67055ddfb9ec
Content-Type: text/html; charset="UTF-8"

<div dir="ltr">Off</div>

--001a1144b8cc8e9a67055ddfb9ec--
'

I tried something like this, but it failed:

re.findall(r'/\r/\n(.+?)/\r/\n', body)

to filter between the line breaks, but it didn't work. Thanks in advance!

Answer 1

If you want to match on [\r\n] explicitly, use lookarounds, as shown here:

re.findall(r'(?<=[\r\n]).+(?=[\r\n])', body)

However, Python's re.findall accepts a flag, re.MULTILINE, that makes ^ and $ match at the start and end of each line, which makes your code easier to read:

re.findall(r'^.+$', body, re.MULTILINE)
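To see how the two patterns in this answer behave on the sample body from the question, here is a minimal sketch. It assumes the body uses plain \n line endings, and the final filtering step (dropping boundary and Content-Type lines) is my own addition for illustration, not part of the answer above.

```python
import re

# Sample body from the question, assuming "\n" line endings.
body = (
    '--001a1144b8cc8e9a67055ddfb9ec\n'
    'Content-Type: text/plain; charset="UTF-8"\n'
    '\n'
    'Needed Content\n'
    '\n'
    '--001a1144b8cc8e9a67055ddfb9ec\n'
    'Content-Type: text/html; charset="UTF-8"\n'
    '\n'
    '<div dir="ltr">Off</div>\n'
    '\n'
    '--001a1144b8cc8e9a67055ddfb9ec--\n'
)

# With re.MULTILINE, ^ and $ match at every line, so this returns
# each non-empty line of the body.
lines = re.findall(r'^.+$', body, re.MULTILINE)

# The lookaround version needs a newline character before each match,
# so it skips the very first line of the string.
lookaround = re.findall(r'(?<=[\r\n]).+(?=[\r\n])', body)

# Illustrative extra step: drop MIME boundary lines and part headers.
content = [ln for ln in lines
           if not ln.startswith('--') and not ln.startswith('Content-Type:')]

print(content)  # ['Needed Content', '<div dir="ltr">Off</div>']
```

Both patterns return every non-empty line, so you still have to pick out the line you want afterwards. If the body arrives with \r\n line endings, .+ will include the trailing \r in each match, so strip the results in that case.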

Answer 2

You can use a lookahead assertion (?=...).

>>> import re
>>> body='--001a1144b8cc8e9a67055ddfb9ec\nContent-Type: text/plain; charset="UTF-8'
>>> re.findall(".+(?=\nContent-Type)",body)
['--001a1144b8cc8e9a67055ddfb9ec']
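The same lookahead idea can be pointed at the text the question actually asks for. The sketch below is one possible way to do it, not the answerer's code: extract_plain_text is a hypothetical helper name, and it assumes the text/plain part has a single Content-Type header line followed by a blank line, as in the sample.

```python
import re

# Sample body from the question, assuming "\n" line endings.
body = (
    '--001a1144b8cc8e9a67055ddfb9ec\n'
    'Content-Type: text/plain; charset="UTF-8"\n'
    '\n'
    'Needed Content\n'
    '\n'
    '--001a1144b8cc8e9a67055ddfb9ec\n'
    'Content-Type: text/html; charset="UTF-8"\n'
    '\n'
    '<div dir="ltr">Off</div>\n'
    '\n'
    '--001a1144b8cc8e9a67055ddfb9ec--\n'
)

# Hypothetical helper: capture what follows the text/plain header and its
# blank line, stopping where the lookahead sees line breaks followed by
# the "--" that starts the next boundary. \r?\n accepts both LF and CRLF.
PLAIN_PART = re.compile(
    r'Content-Type: text/plain[^\r\n]*(?:\r?\n){2}(.*?)(?=(?:\r?\n)+--)',
    re.DOTALL,
)

def extract_plain_text(raw_body):
    """Return the text/plain content, or None if no such part is found."""
    match = PLAIN_PART.search(raw_body)
    return match.group(1).strip() if match else None

print(extract_plain_text(body))  # Needed Content
```

The lookahead only checks that a boundary follows; it does not consume it, so the captured group ends cleanly at the last content character. A message whose plain-text part itself contains a line starting with -- would be cut short by this pattern.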

Filtering a string out of an email body using Python
