Can't recreate a curl request

Asked: 2017-01-04 10:53:06

Tags: python scrapy

I am trying to get a phone number from https://lajumate.ro/ajax/phone-number (example product page).

The page requires a POST request with specific cookies and form data. An example curl request looks like this:

curl 'https://lajumate.ro/ajax/phone-number'  -H 'Cookie:  XSRF-TOKEN=eyJpdiI6IkVqUEwrZjU2UDJOaFB3NDl6b0xhTFE9PSIsInZhbHVlIjoiemJnTCt3S0UxTjUwVjVTQk0xUzlRSVpNZGVPM0dIVHBcL1JlYTVabmcxelFPNG5ZZ1d4NGYxbmpnRTAxeVRSaWRcLzZTUVhRVzlNcmtyOHJvcWFOdlE3UT09IiwibWFjIjoiYmUzOGNkNDlkMjMyNzY3YTQxNzE0ZWEwNmJhMDExZWUzODdmZmU5MmZmMTEwODk1ZTE3ZjYxNTkxZjYyNzFkOCJ9; ljs= eyJpdiI6ImdYR28xcnZvSXFiNHpSekVyeHJOQVE9PSIsInZhbHVlIjoiSnJtTlBRMmRJY1ZqNUtxWXdPREdlYnptc3pKWGRmZ1ppdjdCc0lcL040NzlDbytTcWNZb1Bwa0kyejlKM3NmNGZ0dDMwcFNhaXZ6WHlWSExFaHlNYnFnPT0iLCJtYWMiOiI4ZWY2MzRiNTY5Mjc3M2FmYjllNDJiODEyYWRmNzUxNjViYWM0OTIyZjQ3OTRjODhiMjM3N2NlNTJjYWJiNTRiIn0;'  --data '_token=lT8dwMv5vqGrnh0drb6pW7sreYjguJn5qaCXZIck&ad_id=3834372' --compressed

This works (note that the cookies and the token expire). So I created a spider to recreate this request. The code looks like this:

    # Inside a spider callback; assumes `import re` and
    # `from scrapy import FormRequest` at module level.
    req = FormRequest(
        'https://lajumate.ro/ajax/phone-number',
        callback=self.parse_phone,
        formdata={
            'ad_id': re.sub(r".+?(\d+)\.html", r"\1", response.url),
            '_token': response.xpath('//input[@name="_token"]/@value').extract()[0],
        },
        headers={'Cookie': 'ljs=' + ljs + ';XSRF-TOKEN=' + XSRF},
        dont_filter=True,
    )

ljs and XSRF are extracted from the response cookies.
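For illustration, here is a minimal sketch of one way such an extraction could be done with the standard library alone. The helper name `extract_cookie_values` and the sample header values are hypothetical; in a spider the headers would presumably come from something like `response.headers.getlist('Set-Cookie')`, decoded to text first.

```python
from http.cookies import SimpleCookie


def extract_cookie_values(set_cookie_headers, names):
    """Return {name: value} for the requested cookie names.

    `set_cookie_headers` is a list of raw Set-Cookie header strings.
    """
    found = {}
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)  # parses "name=value; attr=..." into morsels
        for name in names:
            if name in jar:
                found[name] = jar[name].value
    return found


# Placeholder headers, not real values from the site.
sample_headers = [
    "XSRF-TOKEN=abc%3D; Max-Age=7200; path=/",
    "ljs=def%3D; path=/; HttpOnly",
]
values = extract_cookie_values(sample_headers, ("ljs", "XSRF-TOKEN"))
ljs, XSRF = values["ljs"], values["XSRF-TOKEN"]
```

Note that the values stay percent-encoded (for example a trailing `%3D`), which matches what shows up in the logged Cookie header.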

I use two debug log calls to inspect the request:

    self.logger.debug('Request headers: %s', dict(req.headers))
    self.logger.debug('Request body: %s', req.body)

This results in:

  

    2017-01-04 11:44:41 [lajumate-sellers] DEBUG: Request headers: {'Cookie': ['ljs=eyJpdiI6IlBTU05tWlV0NW1DZGJaZk5nemEzTUE9PSIsInZhbHVlIjoiY3JDNFR2clpkMGVaNHVqODZFT2NvTmFRb1BKRmZCS0pCRndwd0xNNXVzV2M1WUNCUm5MWXFnbEU5RGZkQnVRNHFNMFp5S0E4TllkZXVtNk5cL3JSU1FBPT0iLCJtYWMiOiJhOWRhYmJmODg1NzcwOTRhNzQ5ZTlhNDg4OTEzZWNiNDc5NDhlNzZmMmQ3MDliYjM0ODlkZDAwOTYzN2NkNTkzIn0%3D; XSRF-TOKEN=eyJpdiI6ImNHNzZhbVViNWxTUm16bmg5amF0SFE9PSIsInZhbHVlIjoiWGRPMWFjVFBPTFNYWkxrNjI2THJIYU1KeStLcTg4Z3FFRkFqOWJjMDdHNUJKNXFuY2pKVXkxTVpuT1ExNXpSZWZHM1FPMzRjSTY0R3lSVndJME1GMFE9PSIsIm1hYyI6ImQwYThlMGQzYzA3NjA3YmE2ZTAwYjA0NjRiNzRjNTY4NGVlNjEwZjUxMzFiMWE0OGI3Nzk5YWVlNmVkODllNGEifQ%3D%3D'], 'Content-Type': ['application/x-www-form-urlencoded']}

    2017-01-04 11:44:41 [lajumate-sellers] DEBUG: Request body: _token=VtCrPpqMwpcO1FRCZ12pnYmXj7Bv14B8o4aRcZyA&ad_id=3576651

This all looks as it should. But when the spider tries to load the page, the request is redirected with a 302 status code.

However, when I copy and paste the debug data into a curl command or hurl.it, I am able to get the data.
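To make that copy-and-paste step reproducible, the logged values can be turned into a curl command programmatically. The following is only a sketch: `to_curl` is a hypothetical helper, the `PLACEHOLDER` strings stand in for the real cookie and token values, and it assumes the header values have already been decoded to plain strings.

```python
import shlex


def to_curl(url, headers, body):
    """Build a shell-safe curl command from a URL, a header dict, and a form body."""
    parts = ["curl", shlex.quote(url)]
    for name, value in headers.items():
        parts += ["-H", shlex.quote("{}: {}".format(name, value))]
    parts += ["--data", shlex.quote(body)]
    return " ".join(parts)


# Placeholder values; substitute the ones from the debug log.
command = to_curl(
    "https://lajumate.ro/ajax/phone-number",
    {"Cookie": "ljs=PLACEHOLDER; XSRF-TOKEN=PLACEHOLDER"},
    "_token=PLACEHOLDER&ad_id=3576651",
)
print(command)
```

Running the printed command shows whether the logged values alone are enough to get the data. That does not prove the spider sent exactly those headers on the wire, since the log only shows the request object as built in the callback.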

Any suggestions on how to solve this?

0 answers:

No answers