Question

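I need to parse all the articles on a website. The site has 1000+ shops. To fetch any article, I need an id_shop in a cookie; I handle that with the Requests module. To get all 1000+ id_shops, I need to parse an Ajax form. I then run 1000+ spiders, one per shop, like this: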

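from scrapy.crawler import Crawler
from scrapy.utils.project import get_project_settings

def setup_crawler(domain):
    # MySpider is defined elsewhere in the same script.
    spider = MySpider(domain=domain)
    settings = get_project_settings()
    crawler = Crawler(settings)
    crawler.configure()
    crawler.crawl(spider)
    # Schedules the crawl; nothing runs until the reactor starts.
    crawler.start()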

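So I have a .py script that performs all these steps, and I run it with python MySpider.py. Everything works. The problem is that I cannot run my spider at the same time as another spider. I am following the approach listed at http://doc.scrapy.org/en/latest/topics/practices.html: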

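from twisted.internet import reactor
from scrapy import log

for domain in ['scrapinghub.com', 'insophia.com']:
    setup_crawler(domain)
log.start()
# Blocks here until the reactor is stopped.
reactor.run()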

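I use MySpider.run() instead of setup_crawler(). The result is that MySpider waits for the other spiders. I have two questions:
1. How can I run MySpider at the same time as another spider?
2. I want to parse the id_shops from the Ajax form and run the 1000+ spiders, one per id_shop, from within a single spider. Is that possible?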

Scrapy parser. A bit complicated

0 answers: