Scrapy does not make POST requests

Time: 2016-10-08 22:16:25

Tags: python ajax scrapy

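I wrote a Scrapy spider that is supposed to handle a site that uses AJAX. In theory it should work fine, and it does work when I drive it manually with fetch() in the Scrapy shell. But when I run "scrapy crawl ..." I don't see any POST requests in the log, and no items are scraped. What could be going wrong, and what is the root cause?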

2016-10-09 01:11:16 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 9,
 'downloader/exception_type_count/twisted.internet.error.DNSLookupError': 1,
 'downloader/exception_type_count/twisted.internet.error.TimeoutError': 8,
 'downloader/request_bytes': 106652,
 'downloader/request_count': 263,
 'downloader/request_method_count/GET': 263,
 'downloader/response_bytes': 5644786,
 'downloader/response_count': 254,
 'downloader/response_status_count/200': 252,
 'downloader/response_status_count/301': 1,
 'downloader/response_status_count/302': 1,
 'dupefilter/filtered': 19,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2016, 10, 8, 22, 11, 16, 949472),
 'log_count/DEBUG': 265,
 'log_count/INFO': 11,
 'request_depth_max': 3,
 'response_received_count': 252,
 'scheduler/dequeued': 263,
 'scheduler/dequeued/memory': 263,
 'scheduler/enqueued': 263,
 'scheduler/enqueued/memory': 263,
 'start_time': datetime.datetime(2016, 10, 8, 22, 7, 7, 811163)}
2016-10-09 01:11:16 [scrapy] INFO: Spider closed (finished)

The log is:

http://www.spoilertv.com/feeds/posts/default/-/Reviews?start-index=501
http://www.spoilertv.com/feeds/posts/default/-/Reviews?start-index=751
http://www.spoilertv.com/feeds/posts/default/-/Reviews?start-index=1001
....
....
http://www.spoilertv.com/feeds/posts/default/-/Reviews?start-index=10001

1 answer:

Answer 0 (score: 1)

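The request returned by the parseProdPage method is never used in the parseCat method, so it never reaches the scheduler. You should start by yielding it: yield self.parseProdPage(response)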

Also, you probably want to set dont_filter=True on that same request, otherwise most of them will be filtered out (because they all have the same URL).
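
To illustrate why the request goes missing, here is a minimal standard-library sketch of the idea. It is not the asker's spider and does not use Scrapy itself: FakeRequest is a hypothetical stand-in for a request object, the functions are plain functions rather than spider methods, and the endpoint URL and form field are invented for the example. It only shows that a callback written as a generator hands back what it yields, so a request that is built but not yielded is lost.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a request object (not a Scrapy class)."""
    url: str
    method: str = "GET"
    formdata: dict = field(default_factory=dict)
    dont_filter: bool = False


def parseProdPage(page_url):
    # Builds the POST request for the AJAX endpoint and returns it.
    # The URL and the "page" field are assumptions made for this sketch.
    return FakeRequest(
        url="http://example.com/ajax",
        method="POST",
        formdata={"page": page_url},
        dont_filter=True,  # every request here targets the same URL
    )


def parseCat_broken(page_url):
    # The request is created, but its return value is discarded,
    # so the caller never receives it.
    parseProdPage(page_url)
    yield from ()  # this generator yields nothing


def parseCat_fixed(page_url):
    # Yielding the request is what passes it back to the caller.
    yield parseProdPage(page_url)


broken = list(parseCat_broken("http://example.com/cat/1"))
fixed = list(parseCat_fixed("http://example.com/cat/1"))

print(len(broken), len(fixed))                # 0 1
print(fixed[0].method, fixed[0].dont_filter)  # POST True
```

In a real spider the same pattern would apply to the callback methods: parseCat would yield the object returned by self.parseProdPage(response), and that request would carry dont_filter=True.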