Question

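I'm running Docker + Python + a Scrapy spider.

My spider only runs as many times as my Celery concurrency limit (I don't know why). Can someone help me understand this?

My docker-compose.py: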

celery:
    build:
      context: .
      dockerfile: ./celery-queue/Dockerfile
    entrypoint: celery
    command: -A tasksSpider worker --loglevel=info  --concurrency=5 -n myuser@%n
    env_file:
    - .env
    depends_on:
    - redis

My spider code:

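# Imports this snippet relies on; groupSpider is assumed to be defined elsewhere in the project.
from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.signalmanager import dispatcher
from scrapy.utils.project import get_project_settings

def spider_results_group():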
    results = []

    def crawler_results(signal, sender, item, response, spider):
        results.append(item)

    dispatcher.connect(crawler_results, signal=signals.item_passed)

    process = CrawlerProcess(get_project_settings())
    process.crawl(groupSpider)
    process.start()  # the script will block here until the crawling is finished
    process.stop()
    return results

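With this code I can run the spider several times, but only 5 times. When I looked into it, I think this is because my concurrency is only 5: when I run it again (the 6th time), it gets stuck.

If you need any other code, please ask.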

Answer 1

Solved by using the following command:

    command: -A tasksSpider worker --loglevel=info  --concurrency=5 --max-tasks-per-child=1 -n myuser@%n

Answer taken from: Running Scrapy spiders in a Celery task
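
Why this helps, as far as I understand it: CrawlerProcess.start() runs the Twisted reactor, and a reactor cannot be started a second time in the same process. With --concurrency=5 there are five pool children, so each one can crawl once; the sixth task lands on a child whose reactor has already run, and it fails (typically with ReactorNotRestartable) or appears stuck. --max-tasks-per-child=1 makes Celery replace each child after a single task, so every crawl gets a fresh process. The sketch below is only a toy model of that behavior in plain Python. It does not use Celery or Twisted, the class and function names are made up for illustration, and the round-robin task assignment is a simplifying assumption.

```python
class ReactorNotRestartable(Exception):
    """Stand-in for twisted.internet.error.ReactorNotRestartable."""


class ChildProcess:
    """Toy model of one pool child that owns a one-shot reactor."""

    def __init__(self):
        self.reactor_used = False
        self.tasks_done = 0

    def run_crawl(self):
        # Mimics CrawlerProcess.start(): the reactor starts only once per process.
        if self.reactor_used:
            raise ReactorNotRestartable("reactor already ran in this process")
        self.reactor_used = True
        self.tasks_done += 1
        return "ok"


def simulate(num_tasks, concurrency, max_tasks_per_child=None):
    """Run num_tasks crawls round-robin over `concurrency` children."""
    children = [ChildProcess() for _ in range(concurrency)]
    outcomes = []
    for i in range(num_tasks):
        slot = i % concurrency
        try:
            outcomes.append(children[slot].run_crawl())
        except ReactorNotRestartable:
            outcomes.append("failed")
        # Replace the child once it reaches its task limit, as the pool would.
        if max_tasks_per_child and children[slot].tasks_done >= max_tasks_per_child:
            children[slot] = ChildProcess()
    return outcomes


# Without recycling, the sixth crawl reuses a child whose reactor already ran.
print(simulate(6, concurrency=5))
# With one task per child, every crawl runs in a fresh child.
print(simulate(6, concurrency=5, max_tasks_per_child=1))
```

The trade-off is that every task pays the cost of starting a new worker process, which is usually small next to the crawl itself.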
