Scrapy: download each item's images into a dynamic folder, then crop those images

Time: 2018-10-30 09:30:33

Tags: image scrapy crop scrapy-pipeline

I am trying to find a way to download the images for each item into a separate folder, with the folder name set from one of the item's fields.

I found a solution for the dynamic folder part, and it works fine: How to download scrapy images in to a dynamic folder?

The solution given there:

import os
import shutil

from scrapy.utils.project import get_project_settings
from slugify import slugify  # assumption: python-slugify; the source does not say

def item_completed(self, results, item, info):
    # Method of an ImagesPipeline subclass.
    for result in [x for ok, x in results if ok]:
        path = result['path']
        slug = slugify(item['designer'])

        settings = get_project_settings()
        storage = settings.get('IMAGES_STORE')

        target_path = os.path.join(storage, slug, os.path.basename(path))
        path = os.path.join(storage, path)

        # Create the per-item folder if it does not exist yet
        if not os.path.exists(os.path.join(storage, slug)):
            os.makedirs(os.path.join(storage, slug))

        # Move the image from its default location into the per-item folder
        shutil.move(path, target_path)

    if self.IMAGES_RESULT_FIELD in item.fields:
        item[self.IMAGES_RESULT_FIELD] = [x for ok, x in results if ok]
    return item

I also found code for cropping the downloaded images here:

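import os

from PIL import Image
from scrapy.exceptions import DropItem

def item_completed(self, results, item, info):
    # Method of an ImagesPipeline subclass.
    image_paths = [x['path'] for ok, x in results if ok]
    if not image_paths:
        raise DropItem("Item contains no images")

    # Only crop images for items that come from this referrer
    if item['refer'] == 'someurl.com':
        for a in image_paths:
            o_img = os.path.join(self.store.basedir, a)

            if os.path.isfile(o_img):
                image = Image.open(o_img)
                x, y = image.size
                # Trim 35 pixels off the bottom of images taller than 120 pixels
                if y > 120:
                    image = image.crop((0, 0, x, y - 35))
                    image.save(o_img, 'JPEG')

    return item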

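What I want to do is merge the two examples into a single function that stores the images in the dynamically generated folder and then crops them inside that folder.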

I have tried several approaches, but could not get the two to work together.

Thanks!
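A minimal sketch of how the two snippets might be combined: move each downloaded file into the per-item folder first, then crop it at its new location. It rests on several assumptions. `safe_folder_name` is a hypothetical stand-in for `slugify`, `FolderCropPipeline` is a made-up class name, and images are assumed to sit in a filesystem store (so `self.store.basedir` exists) as JPEG. Unlike the cropping snippet above, it crops every item rather than checking `item['refer']`.

```python
import os
import re
import shutil

try:
    from scrapy.pipelines.images import ImagesPipeline
except ImportError:
    ImagesPipeline = None  # the helpers below still work without Scrapy


def safe_folder_name(value):
    """Hypothetical stand-in for slugify(): lowercase, hyphen-separated."""
    return re.sub(r"[^\w-]+", "-", str(value)).strip("-").lower()


def bottom_crop_box(width, height, min_height=120, strip=35):
    """Box that drops `strip` pixels from the bottom, or None if too short."""
    if height > min_height:
        return (0, 0, width, height - strip)
    return None


def crop_bottom(image_file):
    """Crop the file in place with Pillow, following bottom_crop_box()."""
    from PIL import Image  # imported lazily so the other helpers need no Pillow

    with Image.open(image_file) as image:
        box = bottom_crop_box(*image.size)
        if box is None:
            return
        cropped = image.crop(box)
    cropped.save(image_file, "JPEG")


def move_to_folder(store_dir, rel_path, folder):
    """Move store_dir/rel_path into store_dir/folder; return the new relative path."""
    new_rel_path = os.path.join(folder, os.path.basename(rel_path))
    os.makedirs(os.path.join(store_dir, folder), exist_ok=True)
    shutil.move(os.path.join(store_dir, rel_path),
                os.path.join(store_dir, new_rel_path))
    return new_rel_path


def relocate_and_crop(results, store_dir, folder, crop=crop_bottom):
    """Move every successfully downloaded image, then crop it where it landed."""
    for ok, result in results:
        if not ok:
            continue  # failed downloads carry a Failure, not a dict
        result["path"] = move_to_folder(store_dir, result["path"], folder)
        crop(os.path.join(store_dir, result["path"]))
    return results


if ImagesPipeline is not None:

    class FolderCropPipeline(ImagesPipeline):
        """Hypothetical pipeline; it would be registered in ITEM_PIPELINES."""

        def item_completed(self, results, item, info):
            folder = safe_folder_name(item["designer"])
            relocate_and_crop(results, self.store.basedir, folder)
            # The parent implementation stores the updated results on the item.
            return super().item_completed(results, item, info)
```

Because the move updates `result['path']` before the crop runs, the crop always opens the file at its new location, which is the step that tends to go wrong when the two original functions are simply pasted together. Leaving the result field to the parent `item_completed` sidesteps the `IMAGES_RESULT_FIELD` lookup used in the first snippet.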

0 answers:

No answers yet