How do I send an email when a new article is published to a web page?

Time: 2016-12-17 02:58:14

Tags: python email

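I wrote a script that scrapes http://sf.eater.com/the-shutter for article publication dates and then emails me the date of the most recent article. While this is great (especially that I got it working!), ideally it would send an email only when a previously unseen (i.e. new) article is published.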

Here is what I wrote:

# Import requests (to download the page)
import requests

# Import BeautifulSoup (to parse what we download)
from bs4 import BeautifulSoup

# Import Time (to add a delay between the times the scrape runs)
import time

# Import smtplib (to allow us to email)
import smtplib

# Import regular expressions
import re 

import sys

# MIME classes for building the message
# (these module paths work on Python 2.5+ and on Python 3)
from email.mime.multipart import MIMEMultipart

from email.mime.text import MIMEText

#-----------------------------------------------------------

#scrape the page
url = "http://sf.eater.com/the-shutter"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(url, headers=headers)

#parse the HTML
soup = BeautifulSoup(response.text, "html.parser")

#create an empty list
dates = []

#iterate through the parsed HTML and extract dates
for span_tag in soup.find_all("span"):
    # one combined test, so a string containing both markers is added only once
    if 'am' in span_tag.text or 'pm' in span_tag.text:
        dates.append(span_tag.text)


# email the results
#sending address
fromaddr = "<insert from add here>" 
#to address
toaddr = "<insert to add here>" 
msg = MIMEMultipart()
msg['From'] = fromaddr
msg['To'] = toaddr
#subject of the email
msg['Subject'] = "Shutter Article Check" 

#body of the email
body = "The most recent Shutter article page for SF Eater was posted on " + dates[0] + ". Here is a link: http://sf.eater.com/the-shutter"
msg.attach(MIMEText(body, 'plain'))

server = smtplib.SMTP('smtp.gmail.com', 587)
server.starttls()
server.login(fromaddr, "<insert password here>") # username/password
text = msg.as_string()
server.sendmail(fromaddr, toaddr, text)
server.quit()

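Basically, it checks whether each parsed string contains am/pm; if so, it adds the string to a list, and then the first item of the list is emailed to an address of my choosing.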
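A plain substring test for 'am' or 'pm' also matches ordinary words such as "ramen" or "spam". One possible tightening, sketched here under the assumption that the page's timestamps contain a clock time like "10:30am" or "2:58 pm" (the actual format on the page is not confirmed), is to match with a regular expression. The names `TIME_PATTERN` and `extract_dates` are made up for this illustration.

```python
import re

# Assumed timestamp shape: a clock time followed by am/pm,
# e.g. "10:30am" or "2:58 pm". Adjust if the page uses another format.
TIME_PATTERN = re.compile(r"\b\d{1,2}:\d{2}\s*[ap]m\b", re.IGNORECASE)


def extract_dates(texts):
    """Return the strings that contain a clock time, in their original order."""
    return [t.strip() for t in texts if TIME_PATTERN.search(t)]
```

In the script this could be called as `dates = extract_dates(span.text for span in soup.find_all("span"))`.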

How can I send the email only when a new date has been added, given that variables do not keep their values between runs of the script?
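One common way to keep a value between runs is to store it outside the process, for example in a small text file. The following is a minimal sketch of that idea, not a definitive solution: the file name `last_seen_date.txt` and the helper names `read_last_seen`, `write_last_seen`, and `is_new_article` are hypothetical, and it assumes the first entry in `dates` changes whenever a new article appears.

```python
import os

STATE_FILE = "last_seen_date.txt"  # hypothetical file name; any writable path works


def read_last_seen(path=STATE_FILE):
    """Return the date string saved by a previous run, or None if none exists."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip()


def write_last_seen(value, path=STATE_FILE):
    """Overwrite the state file with the most recent date string."""
    with open(path, "w") as f:
        f.write(value)


def is_new_article(latest, path=STATE_FILE):
    """Return True (and record `latest`) if it differs from the saved value."""
    latest = latest.strip()
    if latest == read_last_seen(path):
        return False
    write_last_seen(latest, path)
    return True
```

With these helpers, the email section of the script could be wrapped in `if dates and is_new_article(dates[0]):` so that nothing is sent when the newest date is unchanged. Note that the very first run has no saved value, so it would count as new and send one email.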

0 answers:

No answers