How do I use scrapyd to run spiders weekly or monthly?

Time: 2017-07-27 11:41:17

Tags: scrapy-spider scrapyd

As the title says:

How do I schedule a spider to run monthly or weekly,

and how do I use the setting parameter?

Parameters:
project (string, required) - the project name
spider (string, required) - the spider name
setting (string, optional) - a Scrapy setting to use when running the spider
jobid (string, optional) - a job id used to identify the job, overrides the 
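
For illustration, here is a minimal sketch of how these parameters could be assembled into the body of a POST request to schedule.json. The scrapyd documentation shows a setting being passed as a single NAME=VALUE string (for example setting=DOWNLOAD_DELAY=2). The helper name build_schedule_payload and the project, spider, and setting values are hypothetical, chosen only for this example.

```python
from urllib.parse import urlencode


def build_schedule_payload(project, spider, settings=None, jobid=None):
    """Build a form-encoded body for a POST to scrapyd's schedule.json.

    Hypothetical helper: it only builds the body, it does not send it.
    """
    fields = [("project", project), ("spider", spider)]
    # Each Scrapy setting goes in its own "setting" field as NAME=VALUE.
    for name, value in (settings or {}).items():
        fields.append(("setting", "{}={}".format(name, value)))
    if jobid is not None:
        fields.append(("jobid", jobid))
    # urlencode percent-encodes the "=" inside the value; the server decodes it.
    return urlencode(fields)


payload = build_schedule_payload(
    "myproject", "myspider", settings={"DOWNLOAD_DELAY": 2}
)
print(payload)
```

The resulting string can be sent as the request body to http://localhost:6800/schedule.json (assuming scrapyd listens on its default port), which is the same request that curl makes with repeated -d options.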

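1 answer:

Answer 0: (score: 0)

You can use crontab to schedule spiders to run weekly or monthly. After deploying your Scrapy project to scrapyd, you can use the function below, which relies on the python-crontab package, to create a crontab entry: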

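from crontab import CronTab  # provided by the python-crontab package

def set_cron_job(project_name, spider_name, log_path, run_type, hours, minutes, dow, dom, month):
    # Open the crontab of the user that should own the job ('xyz' is a placeholder).
    my_cron = CronTab(user='xyz')
    # The cron job asks scrapyd to schedule the spider and writes curl's output to a log file.
    cmd = ('curl http://localhost:6800/schedule.json -d project=' + project_name +
           ' -d spider=' + spider_name + ' > ' + log_path + ' 2>&1')
    job = my_cron.new(command=cmd)
    # Every schedule type runs at a fixed time of day.
    job.minute.on(minutes)
    job.hour.on(hours)
    run_type = run_type.lower()
    if run_type == "daily":
        pass  # minute and hour are enough
    elif run_type == "weekly":
        job.dow.on(dow)  # day of week
    elif run_type == "monthly":
        job.dom.on(dom)  # day of month
    elif run_type == "yearly":
        job.dom.on(dom)
        job.month.on(month)
    else:
        # Without this check an unknown run_type would still be written to the crontab.
        raise ValueError("unsupported run_type: " + run_type)
    # Persist the new entry and print the resulting crontab.
    my_cron.write()
    print(my_cron.render())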
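
To make the mapping from run_type to cron fields easier to see, here is a small sketch that builds the crontab line as a plain string, without python-crontab. It assumes the standard five-field cron order (minute, hour, day of month, month, day of week). The function name build_cron_line and the sample project and spider names are hypothetical.

```python
def build_cron_line(run_type, minutes, hours, command, dow="*", dom="*", month="*"):
    """Return one crontab line for the given schedule type (illustrative only)."""
    run_type = run_type.lower()
    # Field order in a crontab line: minute hour day-of-month month day-of-week
    if run_type == "daily":
        fields = (minutes, hours, "*", "*", "*")
    elif run_type == "weekly":
        fields = (minutes, hours, "*", "*", dow)
    elif run_type == "monthly":
        fields = (minutes, hours, dom, "*", "*")
    elif run_type == "yearly":
        fields = (minutes, hours, dom, month, "*")
    else:
        raise ValueError("unsupported run_type: " + run_type)
    return " ".join(str(f) for f in fields) + " " + command


# Sample command with hypothetical project and spider names.
cmd = "curl http://localhost:6800/schedule.json -d project=myproject -d spider=myspider"
weekly_line = build_cron_line("weekly", 30, 2, cmd, dow=1)    # Mondays at 02:30
monthly_line = build_cron_line("monthly", 0, 3, cmd, dom=1)   # 1st of each month at 03:00
print(weekly_line)
print(monthly_line)
```

A weekly job therefore fixes only the day-of-week field, and a monthly job fixes only the day-of-month field; the remaining date fields stay as wildcards.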