Question

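I have a string (an email) in which I need to find the word "Downtime", the 8 characters that follow it, and the From: and To: times that precede it. For example: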

mystring="""
AB\r\n\r\n--_=_swift_v4_13613629825124c026192e8_=_\r\nContent-Type: multipart/related;\r\n  oundary="_=_swift_v4_13613629825124c02620826_=_"\r\n\r\n--_ =_swift_v4_13613629825124c02620826_=_\r\n
From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n
some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 00s\r\n\r\n\r\n\r\n\r\n This is a scheduled report from If you wish to no longer receive t=\r\nhis report you can unsubscribe by logging in to and u=\r\npdate your email report settings.\r\nCopyright: 2013 
"""

Expected result:

From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s
From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 00s

Answer 1

You can use a regular expression of the form

From:\s*[^,]+,\s*To:\s[^,]+,\s*Downtime:[\w ]+

Test

>>> import re
>>> re.findall(r'From:\s*[^,]+,\s*To:\s[^,]+,\s*Downtime:[\w ]+',  mystring)
['From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s', 'From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 00s']
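
If you need the individual fields rather than the whole matched line, one option is to add named groups to the same pattern. The following is only a sketch: the `sample` string, the group names (`start`, `end`, `downtime`) and the helper `extract_records` are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical sample in the same shape as the email body from the question.
sample = (
    "header text\r\n"
    "From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n"
    "some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, "
    "Downtime: 0h 50m 00s\r\n"
)

# Same structure as the pattern above, with one named group per field.
RECORD = re.compile(
    r'From:\s*(?P<start>[^,]+),\s*To:\s*(?P<end>[^,]+),'
    r'\s*Downtime:\s*(?P<downtime>[\w ]+)'
)

def extract_records(text):
    """Return one dict per From/To/Downtime record found in text."""
    return [
        {name: value.strip() for name, value in m.groupdict().items()}
        for m in RECORD.finditer(text)
    ]

records = extract_records(sample)
for r in records:
    print(r["start"], r["end"], r["downtime"])
```

Each dict then holds the three values as plain strings, which is usually easier to work with than re-splitting the matched line.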

Answer 2

Although nu11p01n73R's answer works (I think; I haven't checked the regex myself), you can do this quite simply with plain string operations.

mystring="""AB\r\n\r\n--_=_swift_v4_13613629825124c026192e8_=_\r\nContent-Type: 
multipart/related;\r\n  oundary="_=_swift_v4_13613629825124c02620826_=_"\r\n\r\n--_ 
=_swift_v4_13613629825124c02620826_=_\r\n
From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n
some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 
00s\r\n\r\n\r\n\r\n\r\n This is a scheduled report from If you wish to no longer receive 
t=\r\nhis report you can unsubscribe by logging in to and u=\r\npdate your email report 
settings.\r\nCopyright: 2013 
"""  #imported from where ever and however

from_loc = mystring.find("From: ")
dtime_right = mystring.find("\r\n", from_loc)  # find the end of the line after the downtime
msg = mystring[from_loc:dtime_right]  # string slicing

>>> print(msg)

From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s

Note: if for some reason you want to save on lines, you can condense this into one line:

msg = mystring[mystring.find("From: "):mystring.find("\r\n", mystring.find("From: "))]

This is really messy and I don't recommend it, but the option is there :P
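
The snippet above only returns the first record, and `str.find` returns -1 when nothing matches, in which case the slice quietly comes out meaningless. Assuming every record ends at a `\r\n`, a loop along these lines could collect all of them. `find_all_records` and `sample` are hypothetical names used for illustration only.

```python
def find_all_records(text, start_marker="From: ", end_marker="\r\n"):
    """Collect every slice running from start_marker to the next end_marker."""
    results = []
    pos = text.find(start_marker)
    while pos != -1:
        end = text.find(end_marker, pos)
        if end == -1:
            # No line ending left: take the rest of the text.
            end = len(text)
        results.append(text[pos:end])
        # Continue searching after the record that was just collected.
        pos = text.find(start_marker, end)
    return results

# Hypothetical sample in the same shape as the email body from the question.
sample = (
    "header text\r\n"
    "From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n"
    "some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, "
    "Downtime: 0h 50m 00s\r\n"
)

for line in find_all_records(sample):
    print(line)
```

Note that this relies on the literal marker `From: ` (capital F, trailing space), so the lowercase "from" in the email footer is not picked up.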

Answer 3

Try this:

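    import re
    # r'' instead of ur'' (the ur prefix is Python 2 only); the last group uses a plain
    # space instead of \s so it stops at the line break instead of running into the next line
    p = re.compile(r'from:\s*([0-9\-\s:]+),\s*to:\s*([0-9\-\s:]+),\s*downtime:\s*([0-9hms ]+)', re.MULTILINE | re.IGNORECASE)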
    test_str = u"AB\r\n\r\n--_=_swift_v4_13613629825124c026192e8_=_\r\nContent-Type: multipart/related;\r\n oundary=\"_=_swift_v4_13613629825124c02620826_=_\"\r\n\r\n--_ =_swift_v4_13613629825124c02620826_=_\r\n\nFrom: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n\nsome more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 00s\r\n\r\n\r\n\r\n\r\n This is a scheduled report from If you wish to no longer receive t=\r\nhis report you can unsubscribe by logging in to and u=\r\npdate your email report settings.\r\nCopyright: 2013 \n"

    re.findall(p, test_str)
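
`re.findall` with capturing groups returns a list of tuples, one string per group, so the values usually need some post-processing. As a rough sketch (not part of the original answer), the variation below splits the downtime into hours, minutes and seconds and converts everything to `datetime` and `timedelta` objects. The names `PATTERN`, `FMT`, `parse_records` and `sample` are assumptions for illustration, as is the timestamp format `%Y-%m-%d %H:%M:%S`.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp layout, based on the sample lines in the question.
FMT = "%Y-%m-%d %H:%M:%S"

# Variation of the pattern above: hours, minutes and seconds get their own groups.
PATTERN = re.compile(
    r'from:\s*([0-9\-\s:]+),\s*to:\s*([0-9\-\s:]+),'
    r'\s*downtime:\s*(\d+)h\s*(\d+)m\s*(\d+)s',
    re.IGNORECASE,
)

def parse_records(text):
    """Return (start, end, downtime) tuples as datetime/timedelta objects."""
    records = []
    for start, end, h, m, s in PATTERN.findall(text):
        records.append((
            datetime.strptime(start.strip(), FMT),
            datetime.strptime(end.strip(), FMT),
            timedelta(hours=int(h), minutes=int(m), seconds=int(s)),
        ))
    return records

# Hypothetical sample in the same shape as the email body from the question.
sample = (
    "header text\r\n"
    "From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n"
    "some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, "
    "Downtime: 0h 50m 00s\r\n"
)

for start, end, downtime in parse_records(sample):
    # Compare the reported downtime with the difference of the two timestamps.
    print(start, end, downtime, end - start == downtime)
```

Having real date objects makes it straightforward to sort the records or to cross-check the reported downtime against the two timestamps.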

