Getting a 403 error with a Scrapy request

Date: 2018-01-09 21:15:32

Tags: python request scrapy

When I use the Python requests module for the following HTTP request, it returns a dictionary that is exactly what I need:

import requests

payload = {'x-algolia-application-id':'Q0TMLOPF1J','x-algolia-api-key':'30a0c84a152d179ea8aa1a7a59374d08', 'hitsPerPage':'40', 'numericFilters': ['startdate > 1511095966851'],'facets': '*' }  

url = 'https://q0tmlopf1j-3.algolianet.com/1/indexes/sitecore-events'

r = requests.get(url, params=payload).json()

However, when I try to implement this as a Scrapy request so that I can parse the results:

def start_requests(self):
    payload = {'x-algolia-application-id':'Q0TMLOPF1J','x-algolia-api-key':'30a0c84a152d179ea8aa1a7a59374d08', 'hitsPerPage':'40', 'numericFilters': ['startdate > 1511095966851'],'facets': '*' }  

    url = 'https://q0tmlopf1j-3.algolianet.com/1/indexes/sitecore-events'

    yield scrapy.Request(url,
                         body=json.dumps(payload),
                         method='GET',
                         callback=self.parse_item)

def parse_item(self,response):
    # I want to parse the dict here

I get a 403 error. I know I am doing something simple wrong. What is it?

1 Answer:

Answer 0: (score: -1)

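I know you have already solved the problem by omitting the parameters, but the correct way is to use FormRequest. With method='GET', FormRequest URL-encodes formdata into the query string of the URL, just as requests does with params. A plain Request with body=json.dumps(payload) instead sends the payload as a JSON request body, so the application ID and API key never appear in the query string: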

yield scrapy.FormRequest(
    url=url,
    method='GET',
    formdata=payload,
    callback=self.parse_item
)
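
To make the difference concrete, here is a minimal sketch, using only the standard library, of the URL that ends up being requested when the payload is sent as query parameters. The helper name build_get_url is hypothetical and only for illustration; it is not part of requests or Scrapy, and the exact escaping those libraries apply may differ slightly.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_get_url(base_url, params):
    """Hypothetical helper: append params to base_url as a query string."""
    # doseq=True expands list values (e.g. numericFilters) into
    # repeated key=value pairs instead of encoding the list's repr.
    return base_url + "?" + urlencode(params, doseq=True)


payload = {
    'x-algolia-application-id': 'Q0TMLOPF1J',
    'x-algolia-api-key': '30a0c84a152d179ea8aa1a7a59374d08',
    'hitsPerPage': '40',
    'numericFilters': ['startdate > 1511095966851'],
    'facets': '*',
}
url = 'https://q0tmlopf1j-3.algolianet.com/1/indexes/sitecore-events'

full_url = build_get_url(url, payload)

# Round-trip the query string to confirm the credentials are in the URL.
query = parse_qs(urlsplit(full_url).query)
print(full_url)
```

If this is what the server expects, passing full_url to a plain scrapy.Request(full_url, callback=self.parse_item) should behave the same as the FormRequest above, since in both cases the parameters travel in the URL rather than in the body. The missing credentials in the query string are the likely cause of the 403, though the server's exact checks are not visible from the question.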