Question

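Looking at the Scrapy and Scrapyd documentation, it seems the only way to write out the scraped results is to write code in the spider's own pipeline. A colleague tells me there is another way: intercepting the scraped results from inside Scrapyd!

Has anyone heard of this? If it is possible, could someone shed some light on it for me?

Thanks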

item exporters

feed exports

scrapyd config

Answer 1

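Scrapyd is indeed a service that can be used to schedule crawls of your Scrapy application through a JSON API. It also lets you integrate Scrapy with other frameworks such as Django; see this guide if you are interested.

Here is the Scrapyd documentation.

However, if your question is about saving the scraped results, the standard way is to do it in the pipelines.py file of your Scrapy application.

An example: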
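To make the "schedule through a JSON API" part concrete, here is a minimal sketch of calling Scrapyd's schedule.json endpoint from Python. It assumes Scrapyd is listening on its default address http://localhost:6800; the project and spider names are placeholders, and the helper names build_schedule_request and schedule are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_schedule_request(project, spider, base_url="http://localhost:6800", **spider_args):
    """Build a POST request for Scrapyd's schedule.json endpoint.

    base_url is an assumption (Scrapyd's default bind address and port).
    Any extra keyword arguments are sent along as spider arguments.
    """
    params = {"project": project, "spider": spider, **spider_args}
    data = urlencode(params).encode("utf-8")
    return Request(f"{base_url}/schedule.json", data=data, method="POST")


def schedule(project, spider, **kwargs):
    """Send the request and return Scrapyd's decoded JSON reply."""
    with urlopen(build_schedule_request(project, spider, **kwargs)) as resp:
        return json.load(resp)
```

With a running Scrapyd instance and a deployed project, schedule("myproject", "myspider") would queue a crawl. Note that this only starts the job; it does not by itself change where the scraped items are stored.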

class Pipeline(object):
    def __init__(self):
        # Initialize the pipeline here, e.g. connect to a database or open a file.
        pass

    def process_item(self, item, spider):
        # Specify here what should be done with each scraped item.
        # Return the item so that any later pipelines also receive it.
        return item

Remember to define the pipeline you are using in your Scrapy application's settings.py:

ITEM_PIPELINES = {
   'scrapy_application.pipelines.Pipeline': 100,
}

Source: https://doc.scrapy.org/en/latest/topics/item-pipeline.html
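As a more concrete version of the pipeline skeleton, here is a rough sketch that writes every item to a JSON Lines file. The class name JsonLinesPipeline and the output path items.jl are made up for illustration, and dict(item) assumes your items are plain dicts or scrapy.Item objects.

```python
import json


class JsonLinesPipeline:
    """Write each scraped item as one JSON object per line."""

    def __init__(self, path="items.jl"):  # hypothetical output path
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called by Scrapy once when the spider starts.
        self.file = open(self.path, "w", encoding="utf-8")

    def close_spider(self, spider):
        # Called by Scrapy once when the spider finishes.
        self.file.close()

    def process_item(self, item, spider):
        # Assumes the item can be converted with dict().
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item
```

To enable it, you would list it in ITEM_PIPELINES in the same way as the Pipeline class above, using your own module path.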

Is there a pipeline concept in ScrapyD?
