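When I run my spider manually in Scrapy, the first run executes the code but gives me 0 results. When I run it a second time, however, it crawls perfectly. That is fine when I do it by hand, but when I run it from crontab it produces no results at all. This is what I get (I removed the time data):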
{'downloader/request_bytes': 221,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 116972,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(xxx, x, xxx, xx, xx, xx, xxxx),
'log_count/DEBUG': 2,
'log_count/INFO': 7,
'log_count/WARNING': 1,
'response_received_count': 1,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
When I run it manually, I get 9 results:
{'downloader/request_bytes': 4696,
'downloader/request_count': 10,
'downloader/request_method_count/GET': 10,
'downloader/response_bytes': 202734,
'downloader/response_count': 10,
'downloader/response_status_count/200': 10,
'dupefilter/filtered': 9,
'finish_reason': 'finished',
'finish_time': datetime.datetime(xxx, x, xx, xx, xx, xx, xxxxxx),
'item_scraped_count': 9,
'log_count/DEBUG': 21,
'log_count/INFO': 8,
'log_count/WARNING': 1,
'request_depth_max': 2,
'response_received_count': 10,
'scheduler/dequeued': 10,
'scheduler/dequeued/memory': 10,
'scheduler/enqueued': 10,
'scheduler/enqueued/memory': 10,
What am I doing wrong?
Also, if I ran the same crontab job a second time within a minute, would it produce results? If so, how would I do that?
Answer 0: (score: -1)
Can you show the full command you use to run the spider from cron? Also try adding -L DEBUG to the crawl command to see more detail.
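One way to capture that DEBUG output from a cron run is a small wrapper script. This is only a sketch: the project path, spider name, and log file below are hypothetical placeholders, and it assumes the cron problem is related to the working directory or environment, which the question does not confirm. It uses Scrapy's `-L` and `--logfile` options and starts Scrapy through the same Python interpreter that runs the script.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical values: replace with your own project path and spider name.
PROJECT_DIR = Path("/path/to/your/scrapy/project")
SPIDER_NAME = "myspider"
LOG_FILE = "/tmp/myspider_cron.log"


def build_crawl_command(spider_name, log_file):
    """Build the crawl command with DEBUG logging written to a file."""
    return [
        sys.executable, "-m", "scrapy",  # same interpreter that runs this script
        "crawl", spider_name,
        "-L", "DEBUG",
        "--logfile", log_file,
    ]


def run_crawl(project_dir, spider_name, log_file):
    """Run the crawl from the project directory and return the exit code."""
    result = subprocess.run(
        build_crawl_command(spider_name, log_file),
        cwd=project_dir,  # cron does not start in the project directory
    )
    return result.returncode


if __name__ == "__main__":
    sys.exit(run_crawl(PROJECT_DIR, SPIDER_NAME, LOG_FILE))
```

Point the crontab entry at this script using the absolute path of the Python interpreter that has Scrapy installed, then compare the log file from the cron run with the output of a manual run.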