import scrapy
import json


class GettingtonDSpider(scrapy.Spider):
    name = "gettington_d"
    allowed_domains = ["gettington.com"]
    start_urls = ['https://api.gettington.com/v1/products?showMPP=false&rows=24&q=Keyword:south%20shore%20furniture&productfilter=null&callback=searchCallback']

    def parse(self, response):
        jsonresp = json.dumps(response.body)
        jsonresp = json.loads(jsonresp)
I have tried many approaches, but they all failed:
None of the above works. How can I fix the error?
Answer 0 (score: 0)
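You first have to strip the unnecessary wrapper from response.body. The API returns JSONP, that is, the JSON wrapped in a searchCallback(...) call, so the raw body is not valid JSON: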
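import re
...
# response.text is the decoded (str) body; the group captures everything inside searchCallback(...)
json_string = re.search(r'searchCallback\((.*)\)', response.text, re.DOTALL).group(1)
jsonresp = json.loads(json_string)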
jsonresp is now a dict.
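
If you want to try the unwrapping step outside of Scrapy, here is a minimal, self-contained sketch. The helper name strip_jsonp and the sample payload are made up for illustration and are not part of the site's API; the pattern assumes the body is a single callback call, optionally followed by a semicolon.

```python
import json
import re

# Assumed shape of the body: <callbackName>( <json> ) with an optional trailing ';'
_JSONP_RE = re.compile(r'^\s*[\w.$]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def strip_jsonp(text):
    """Hypothetical helper: parse the JSON payload wrapped in a JSONP callback."""
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError("body does not look like a JSONP response")
    return json.loads(match.group(1))


# Made-up sample body, only to show the shape of the input
sample = 'searchCallback({"products": [{"id": 1}], "total": 1});'
data = strip_jsonp(sample)
print(type(data).__name__, data["total"])
```

Inside the spider you would call it as strip_jsonp(response.text). Matching any callback name rather than the literal searchCallback keeps the helper working if the callback parameter in the URL changes.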