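I have a string (an email) and I need to search it for the word "Downtime", then extract the 8 or so characters that follow that word together with the From: and To: times that precede it. For example,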
mystring="""
AB\r\n\r\n--_=_swift_v4_13613629825124c026192e8_=_\r\nContent-Type: multipart/related;\r\n oundary="_=_swift_v4_13613629825124c02620826_=_"\r\n\r\n--_ =_swift_v4_13613629825124c02620826_=_\r\n
From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n
some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 00s\r\n\r\n\r\n\r\n\r\n This is a scheduled report from If you wish to no longer receive t=\r\nhis report you can unsubscribe by logging in to and u=\r\npdate your email report settings.\r\nCopyright: 2013
"""
Expected result:
From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s
From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 00s
Answer 0 (score: 1)
You can use a regular expression of the form
From:\s*[^,]+,\s*To:\s[^,]+,\s*Downtime:[\w ]+
Test
>>> import re
>>> re.findall(r'From:\s*[^,]+,\s*To:\s[^,]+,\s*Downtime:[\w ]+', mystring)
['From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s', 'From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 00s']
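To make the pieces of that pattern easier to follow, here is a minimal sketch of the same expression written with re.VERBOSE so each part can carry a comment. The shortened `sample` string is an assumption made for illustration; it is not the full email from the question.

```python
import re

# Same pattern as in the answer, split across lines with re.VERBOSE.
# Whitespace in a verbose pattern is ignored except inside a character class,
# so the literal space in the last class still counts.
pattern = re.compile(r"""
    From:\s*[^,]+,      # From:, optional whitespace, then everything up to the next comma
    \s*To:\s[^,]+,      # To:, one whitespace character, then everything up to the next comma
    \s*Downtime:[\w ]+  # Downtime:, then letters, digits and spaces (stops at the line ending)
""", re.VERBOSE)

# Hypothetical, shortened stand-in for the email body.
sample = (
    "header\r\n"
    "From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n"
    "some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, "
    "Downtime: 0h 50m 00s\r\n"
)

matches = pattern.findall(sample)
for line in matches:
    print(line)
```

Because the pattern has no capture groups, findall returns each whole match as a single string.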
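Answer 1 (score: 1)
While nu11p01n73R's answer works (I think; I haven't looked at the regex closely myself), you can do this quite simply with plain string operations.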
mystring="""AB\r\n\r\n--_=_swift_v4_13613629825124c026192e8_=_\r\nContent-Type:
multipart/related;\r\n oundary="_=_swift_v4_13613629825124c02620826_=_"\r\n\r\n--_
=_swift_v4_13613629825124c02620826_=_\r\n
From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n
some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m
00s\r\n\r\n\r\n\r\n\r\n This is a scheduled report from If you wish to no longer receive
t=\r\nhis report you can unsubscribe by logging in to and u=\r\npdate your email report
settings.\r\nCopyright: 2013
""" # obtained from wherever, however you load it
from_loc = mystring.find("From: ") # index of the first "From: "
dtime_right = mystring.find("\r\n", from_loc) # end of the line that holds the downtime
msg = mystring[from_loc:dtime_right] # string slicing; this picks up only the first record
>>> print(msg)
From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s
Note: if you want to save lines for some reason, you can compress it into one line:
msg = mystring[mystring.find("From: "):mystring.find("\r\n", mystring.find("From: "))]
That is really messy and I don't recommend it, but the option is there :P
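The find-based approach above returns only the first record, while the question expects both. A minimal sketch of how it could be extended with a loop follows. The helper name `extract_records`, its parameters, and the shortened `sample` string are assumptions made for illustration. It also assumes that every "From: " in the text starts a record you want, which may not hold if the email headers contain their own From: line.

```python
def extract_records(text, start_marker="From: ", end_marker="\r\n"):
    """Collect every span that runs from start_marker up to the next end_marker."""
    records = []
    pos = text.find(start_marker)
    while pos != -1:
        end = text.find(end_marker, pos)
        if end == -1:
            # No line ending left: take the rest of the text.
            end = len(text)
        records.append(text[pos:end])
        # Continue searching after the record just collected.
        pos = text.find(start_marker, end)
    return records


# Hypothetical, shortened stand-in for the email body.
sample = (
    "header\r\n"
    "From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n"
    "some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, "
    "Downtime: 0h 50m 00s\r\n"
)

records = extract_records(sample)
for record in records:
    print(record)
```

Passing the previous end position as the second argument to find is what keeps the loop moving forward instead of finding the same record again.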
Answer 2 (score: 0)
Try this
import re
# Three groups: the From timestamp, the To timestamp, and the duration (e.g. "1h 30m 01s").
p = re.compile(r'from:\s*([0-9\-\s:]+),\s*to:\s*([0-9\-\s:]+),\s*downtime:\s*(\d+h \d+m \d+s)', re.IGNORECASE)
test_str = u"AB\r\n\r\n--_=_swift_v4_13613629825124c026192e8_=_\r\nContent-Type: multipart/related;\r\n oundary=\"_=_swift_v4_13613629825124c02620826_=_\"\r\n\r\n--_ =_swift_v4_13613629825124c02620826_=_\r\n\nFrom: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n\nsome more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, Downtime: 0h 50m 00s\r\n\r\n\r\n\r\n\r\n This is a scheduled report from If you wish to no longer receive t=\r\nhis report you can unsubscribe by logging in to and u=\r\npdate your email report settings.\r\nCopyright: 2013 \n"
re.findall(p, test_str)
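When the pattern contains capture groups, findall returns one tuple per match rather than the whole matched string. As a rough sketch of how those tuples might be consumed, the example below unpacks each one and parses the two timestamps with datetime.strptime. The names `pattern`, `sample`, and `rows` are illustrative. The duration group assumes the downtime always has the shape `1h 30m 01s`, and the timestamp format string is inferred from the sample data in the question.

```python
import re
from datetime import datetime

# Capture groups: From timestamp, To timestamp, duration text.
pattern = re.compile(
    r"from:\s*([0-9\-\s:]+),\s*to:\s*([0-9\-\s:]+),\s*downtime:\s*(\d+h \d+m \d+s)",
    re.IGNORECASE,
)

# Hypothetical, shortened stand-in for the email body.
sample = (
    "header\r\n"
    "From: 2013-01-11 04:26:07, To: 2013-01-11 05:56:08, Downtime: 1h 30m 01s\r\n\r\n"
    "some more text here From: 2013-01-29 04:51:07, To: 2013-01-29 05:41:07, "
    "Downtime: 0h 50m 00s\r\n"
)

rows = []
for start, end, downtime in pattern.findall(sample):
    began = datetime.strptime(start, "%Y-%m-%d %H:%M:%S")
    ended = datetime.strptime(end, "%Y-%m-%d %H:%M:%S")
    # Keep the reported duration text next to the interval computed from the timestamps.
    rows.append((began, ended, downtime, ended - began))

for began, ended, downtime, elapsed in rows:
    print(began, ended, downtime, elapsed)
```

Comparing the computed interval with the reported downtime text is one way to sanity-check that the right fields were captured.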