I'm getting a ValueError:
raise ValueError('Missing scheme in request url: %s' % self._url)
ValueError: Missing scheme in request url: h
My items.py code is:
import scrapy

class Brand(scrapy.Item):
    name = scrapy.Field()
    url = scrapy.Field()
    brand_image = scrapy.Field()
    image_urls = scrapy.Field()
    images = scrapy.Field()
My settings.py is:
BOT_NAME = 'scraper'
SPIDER_MODULES = ['scraper.spiders']
NEWSPIDER_MODULE = 'scraper.spiders'
ITEM_PIPELINES = {'scrapy.contrib.pipeline.images.ImagesPipeline': 1}
IMAGES_STORE = 'images'
My spider code:
import scrapy
import json
from scraper.items import Brand

class QuotesSpider(scrapy.Spider):
    name = "brandDetails"
    allowed_domains = ["ozhat-turkiye.com"]
    with open('brands.json') as data_file:
        data_item = json.load(data_file)
    start_urls = list()
    for item in data_item:
        start_urls.append(item["url"])

    def parse(self, response):
        item = Brand()
        name = response.css("div.th::text").extract_first()
        name = name.replace('Products of ', '')
        item['name'] = name
        item['url'] = response.url
        urls = response.css("div.productimage img::attr(src)").extract_first()
        urls = response.urljoin(urls)
        item['image_urls'] = urls
        yield item
Answer 0 (score: 1)
Missing scheme in request url
always means that your URL is invalid because it lacks a leading http:// or https://.
So prepend https:// or http:// to the image URL you have:
    'https://' + response.css("div.productimage img::attr(src)").extract_first()
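
One more thing worth checking: the URL in the traceback is the single character "h", which suggests a string was iterated character by character. Scrapy's ImagesPipeline expects image_urls to be a list of URLs, so assigning a plain string there can produce exactly this message. Below is a minimal sketch of building that list, using only the standard library so it runs without Scrapy; build_image_urls and the example.com URLs are hypothetical names introduced for illustration and do not come from the question.

```python
from urllib.parse import urljoin


def build_image_urls(page_url, src):
    """Return a list of absolute image URLs; empty if no src was found.

    Hypothetical helper for illustration only (not part of Scrapy).
    """
    if not src:
        # extract_first() returns None when the selector matches nothing
        return []
    # urljoin resolves a relative src against the page URL, which is
    # what response.urljoin does inside a spider
    return [urljoin(page_url, src)]


# Iterating over a bare string yields single characters, so the first
# "URL" a consumer would see is just the first letter.
url_as_string = "http://example.com/img/a.jpg"
first_seen = next(iter(url_as_string))

image_urls = build_image_urls("http://example.com/brands/x", "/img/a.jpg")
print(first_seen)
print(image_urls)
```

In the spider from the question, the equivalent change would presumably be item['image_urls'] = [urls] instead of item['image_urls'] = urls, since response.urljoin already returns an absolute URL.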