Question

I am using a CrawlSpider with the scrapy command line. Everything works fine:

scrapy crawl --nolog newproductcrawler

Now I want to use CrawlerProcess instead, and importing the item module crashes.

Exception has occurred: ModuleNotFoundError

No module named 'productsupervision'

Likewise, in settings.py I enabled a pipeline that lives under a similar module path, and that pipeline is not loaded either.

from productsupervision.responseitem import ResponseItem

startUp.py

 [...]
 process = CrawlerProcess(get_project_settings())
 process.crawl(NewproductcrawlerSpider, url='http://www.example.com', domain='www.example.com')
 process.start()

NewproductcrawlerSpider.py

import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
from array import array
from productsupervision.responseitem import ResponseItem #EXCEPTION
class NewproductcrawlerSpider(CrawlSpider):
  name = 'newproductcrawler'

The folder structure is (I can't paste an image anymore! ;o( ):

+ productsupervision

++ spiders

+++ newproductcrawler.py (the crawler)

+++ startUp.py

++ middlewares.py

++ pipelines.py

++ responseitem.py

++ settings.py

+ scrapy.cfg

I am looking for the correct way to import the item module when using CrawlerProcess.

Answer 1

OK, found it: startUp.py must be in the project root, i.e. the same folder as scrapy.cfg.
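The likely reason is that running `python startUp.py` puts the script's own folder on `sys.path`, so with startUp.py inside spiders/ the `productsupervision` package cannot be found. If moving the script is not an option, a minimal sketch of an alternative is to locate the folder containing scrapy.cfg and add it to `sys.path` before any `productsupervision` import. The helper names `find_project_root` and `add_project_root_to_path` are hypothetical and not part of Scrapy:

```python
import sys
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` until a folder containing scrapy.cfg is found."""
    start = start.resolve()
    for folder in (start, *start.parents):
        if (folder / "scrapy.cfg").is_file():
            return folder
    return None  # no scrapy.cfg anywhere above `start`


def add_project_root_to_path(start: Path) -> Optional[Path]:
    """Put the project root at the front of sys.path so project packages import."""
    root = find_project_root(start)
    if root is not None and str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

In startUp.py this would be called as `add_project_root_to_path(Path(__file__).parent)` at the very top, before the spider import. I have only verified the move to the project root myself, so treat this as an untested workaround.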
