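Scenario: I have a Scrapy spider that scrapes a website. When it finds the required keyword in the scraped details, it sends an email. The site changes its data every 30 minutes, so I need to scrape it again for the keyword at that interval and send an email if it is found. How can I loop every 30 minutes in Scrapy (Python)?
Code: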
# -*- coding: utf-8 -*-
import scrapy
from scrapy.http import Request
import smtplib
from email.mime.text import MIMEText
import threading
import time


class NewFilmSpiderSpider(scrapy.Spider):
    name = 'new_film_spider'
    allowed_domains = ['www.xxx.in']
    start_urls = ['https://www.xxx.in/xxx/now-showing']

    def parse(self, response):
        # Pass the method as target; calling it here would run it in the main thread
        t = threading.Thread(target=self.getDetails, args=(response,))
        t.start()

    def getDetails(self, response):
        FROM_ADDRESS = 'xxx@gmail.com'
        PASSWORD = 'xxx'
        TO_ADDRESS = 'xxx@gmail.com'
        HOST = 'smtp.gmail.com'
        PORT = 587
        # Titles of the movies currently listed on the page
        records = response.xpath('//section[@class="main-section"]/section[2]/section[@class="movie__listing now-showing"]/ul/li/div/dl/dt/a/text()').extract()
        if 'KEYWORD' in str(records):
            receivers = [TO_ADDRESS]
            msg = "Booking Opened"
            try:
                smtpObj = smtplib.SMTP(HOST, PORT)
                smtpObj.set_debuglevel(1)
                smtpObj.ehlo()
                smtpObj.starttls()
                smtpObj.login(FROM_ADDRESS, PASSWORD)
                smtpObj.sendmail(FROM_ADDRESS, receivers, msg)
                smtpObj.quit()
                print("Successfully sent email")
            except Exception as e:
                print("Error: unable to send email: %s" % e)
        time.sleep(60)  # checking every minute
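This code runs the script and sends the email. I don't know how to make it loop. Any leads would be helpful. Thanks.
Update #1: I tried threading, as suggested in the answer, but the program stops after two loops.
Update #2: I had forgotten to add the while loop. It works now.
Answer 0 (score: 2)
You can spawn a thread that runs every 30 minutes, like this: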
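import threading
import time

# Inside the spider class:
def __init__(self):
    # . . .
    # Pass the method as target; calling it here would block __init__ forever
    t = threading.Thread(target=self.every_thirty_min)
    t.start()

def every_thirty_min(self):
    while True:
        print('up')
        # do stuff
        time.sleep(1800)  # 30 min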
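If you also want to be able to stop the loop cleanly, a variation on the snippet above might look like the following. This is only a minimal sketch, not Scrapy-specific: the helper names `run_periodically` and `start_periodic_thread` are made up for illustration, and `task` stands for whatever "do stuff" is in your case. It uses `threading.Event.wait` in place of `time.sleep`, so setting the event ends the loop without waiting out the full interval.

```python
import threading


def run_periodically(task, interval_seconds, stop_event):
    """Call task() repeatedly, pausing interval_seconds between calls,
    until stop_event is set. (Hypothetical helper, for illustration only.)"""
    while not stop_event.is_set():
        task()
        # Event.wait returns as soon as stop_event is set, so a stop
        # request is honoured without sleeping through the whole interval.
        stop_event.wait(interval_seconds)


def start_periodic_thread(task, interval_seconds):
    """Run task on a background thread; return (thread, stop_event)."""
    stop_event = threading.Event()
    thread = threading.Thread(
        target=run_periodically,  # pass the function itself, do not call it
        args=(task, interval_seconds, stop_event),
    )
    thread.daemon = True  # do not keep the process alive at exit
    thread.start()
    return thread, stop_event
```

You would start it with something like `thread, stop_event = start_periodic_thread(my_check, 1800)` (where `my_check` is your own function) and call `stop_event.set()` to end the loop. Note that if `task` raises an exception, the thread exits, so wrap the body of your task in `try`/`except` if it should keep going after a failure.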