Unable to convert unicode to JSON in Scrapy

Time: 2016-12-03 12:27:52

Tags: python-2.7 web-scraping scrapy screen-scraping scrapy-spider

import scrapy
import json
class GettingtonDSpider(scrapy.Spider):
    name = "gettington_d"
    allowed_domains = ["gettington.com"]
    start_urls = ['https://api.gettington.com/v1/products?showMPP=false&rows=24&q=Keyword:south%20shore%20furniture&productfilter=null&callback=searchCallback']
    def parse(self, response):
        jsonresp = json.dumps(response.body)
        jsonresp = json.loads(jsonresp)

I tried several approaches, but none of them worked:

  • response.text
  • encode('UTF-8')
  • response_body_as_unicode

None of the above worked. How can I fix the error?

1 answer:

Answer 0: (score: 0)

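You first have to strip the unnecessary wrapper from response.body. Because the request passes callback=searchCallback, the body comes back as searchCallback(...) around the payload, which is not valid JSON as it stands: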

import re

    ...
    json_string = re.search(r'searchCallback\((.*)\)', response.body).group(1)
    jsonresp = json.loads(json_string)

jsonresp now holds a dict.
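
As a rough, self-contained sketch of the same idea outside a spider: the helper name strip_jsonp, the pattern JSONP_RE, and the sample body below are all made up for illustration, and the real API response will contain different fields. It also shows why the dumps/loads round trip in the question only returns the original string.

```python
import json
import re

# Hypothetical pattern: a callback name, then the payload in parentheses,
# with an optional trailing semicolon. re.DOTALL lets the payload span lines.
JSONP_RE = re.compile(r'^\s*[\w.$]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def strip_jsonp(text):
    """Return the payload inside a JSONP wrapper, or text unchanged if there is none."""
    match = JSONP_RE.match(text)
    return match.group(1) if match else text


# Made-up sample body standing in for the API response.
sample_body = 'searchCallback({"products": [{"name": "desk"}], "total": 1});'

# dumps() encodes the whole text as one JSON string and loads() decodes it
# again, so the result is the original string rather than a dict.
round_trip = json.loads(json.dumps(sample_body))

# Stripping the wrapper first leaves valid JSON that parses into a dict.
data = json.loads(strip_jsonp(sample_body))
```

Inside parse, the same helper could presumably be called on response.body (or on response.text in Scrapy versions that provide it) before json.loads.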