I am trying to extract the email address from a string in the following format:
John Smith <jsmith@email.com>
I have tried the following two approaches, and both raise the same error:
IndexError: list index out of range
email_address = re.findall('(?<=\<)\w+@[a-zA-Z]+\.[a-z]+(?=\>)', sender)[0]
email_address = re.findall('<([^>])>', sender)[0]
The rest of the code:
import webapp2
import logging
from google.appengine.ext.webapp import mail_handlers
from google.appengine.api import mail
import os
from main import WorkRequest
import re

class IncomingMailHandler(mail_handlers.InboundMailHandler):
    def receive(self, message):
        (encoding, payload) = list(message.bodies(content_type='text/plain'))[0]
        body_text = payload.decode()
        logging.info('Received email message from %s, subject "%s": %s' %
                     (message.sender, message.subject, body_text))
        logging.info(message.sender)
        logging.info(message.subject)
        logging.info(body_text)
        sender = str(message.sender)
        email_address = re.findall('<([^>])>', sender)[0]
        wr = WorkRequest()
        wr.email = email_address
        wr.userId = None
        wr.title = message.subject
        wr.content = body_text
        wr.status = "OPEN"
        wr.submission_type = "EMAIL"
        wr.assigned_to = "UNASSIGNED"
        wr.put()

application = webapp2.WSGIApplication([('/_ah/mail/.+', IncomingMailHandler)], debug=True)
Can anyone help? In case it matters, I am using Google App Engine with Python.
Answer 0 (score: 1)
In my case, the first regular expression works fine:
>>> import re
>>> sender = 'John Smith <jsmith@email.com>'
>>> email_address = re.findall('(?<=\<)\w+@[a-zA-Z]+\.[a-z]+(?=\>)', sender)[0]
>>> email_address
'jsmith@email.com'
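The second one fails because findall returns an empty list, so there is no item at index 0. The character class [^>] has no quantifier, so the pattern only matches a single character enclosed in angle brackets:
>>> email_address = re.findall('<([^>])>', sender)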
>>> email_address
[]
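A minimal sketch of one way to repair the second pattern: adding a + quantifier lets the group capture one or more characters, and using re.search with a None check avoids the IndexError when the sender string contains no angle brackets. The names ANGLE_ADDR and extract_email are hypothetical and not part of the original handler.

```python
import re

# Hypothetical names for illustration only.
# [^>]+ matches one or more characters that are not '>'.
ANGLE_ADDR = re.compile(r'<([^>]+)>')


def extract_email(sender):
    """Return the text between angle brackets, or None if there is none."""
    match = ANGLE_ADDR.search(sender)
    return match.group(1) if match else None


print(extract_email('John Smith <jsmith@email.com>'))  # jsmith@email.com
print(extract_email('jsmith@email.com'))               # None
```

Returning None leaves the caller to decide what to do with a sender that has no bracketed address, for example falling back to the whole string.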
You can test regular expressions at http://rubular.com/. It is free and easy to use.
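As an alternative to a hand-written pattern, the standard library's email.utils.parseaddr splits a sender header into a display name and an address. This is a different technique from the regular expressions above, sketched here on the assumption that the sender value is an ordinary "Name <address>" or bare-address string; the function name sender_address is hypothetical.

```python
from email.utils import parseaddr


def sender_address(sender):
    """Return only the address part of a sender string ('' if parsing fails)."""
    display_name, address = parseaddr(sender)
    return address


print(sender_address('John Smith <jsmith@email.com>'))  # jsmith@email.com
print(sender_address('jsmith@email.com'))               # jsmith@email.com
```

Because parseaddr handles both forms, it would also cover the case where the sender arrives without angle brackets, which neither regular expression matches.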