I'm trying to run spiders from a Django management command. It works, but it doesn't use the settings from the Scrapy project.
django_project/
    django_project/
    app1/
    scraping/  # This is a Django app, but it has the Scrapy project inside too
        scrapy_spider/
            settings.py
            spiders/
When I try to specify the settings module inside the command, it returns:
ModuleNotFoundError: No module named 'scrapy_spider'
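One way to narrow down an error like this is to check which dotted paths actually resolve from the directory where manage.py runs. The sketch below is only an illustration: is_importable is a hypothetical helper (not part of Django or Scrapy), and the idea that some path inside settings.py is written relative to the Scrapy project rather than the Django root is an assumption, since that file is not shown here.

```python
import importlib.util


def is_importable(dotted_path: str) -> bool:
    """Return True if dotted_path can be located on the current sys.path."""
    try:
        # find_spec imports parent packages first, so a missing parent
        # raises ModuleNotFoundError instead of returning None.
        return importlib.util.find_spec(dotted_path) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    # Paths taken from the question; results depend on the working directory.
    for path in ("scrapy_spider", "scraping.scrapy_spider.settings"):
        print(path, is_importable(path))
```

If the short path is reported as not importable while the fully qualified one is, that would point at a dotted path somewhere that still uses the short form.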
The command:
    import os

    from django.core.management.base import BaseCommand
    from scrapy.utils.project import get_project_settings
    from twisted.internet import reactor, defer
    from scrapy.crawler import CrawlerRunner

    from scraping.scrapy_spider.spiders.autoscrape_index_spider import AutoScrapeIndexSpider
    from scraping.scrapy_spider.spiders.autoscrape_spider import AutoScrapeSpider


    class Command(BaseCommand):
        def handle(self, *args, **options):
            os.environ['SCRAPY_SETTINGS_MODULE'] = 'scraping.scrapy_spider.settings'
            runner = CrawlerRunner(settings=get_project_settings())

            @defer.inlineCallbacks
            def crawl():
                yield runner.crawl(AutoScrapeIndexSpider)
                yield runner.crawl(AutoScrapeSpider)
                reactor.stop()

            crawl()
            reactor.run()
Do you know how to make this work?