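Below is my code, which produces a 400 error in the Scrapy log. The logic behind the code is as follows:
1) I use a POST request to get my secret token.
2) I set my headers to use the secret token and define the parameters for the API search string. I also think the headers with the secret token should be passed as meta to further requests.
3) I want the parse function to receive the JSON response from the request in #2 and parse it into items. After that, the parse method will contain a loop over a list of parameters for request #2, which get prepared and processed.
The problem is that it doesn't work :) The log is attached. I would like to know whether I am passing the parameters and the secret token correctly, and how I can pass the secret token in meta.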
import scrapy
import json
import requests
import pprint


class YelpSpider(scrapy.Spider):
    name = "yelp"
    allowed_domains = ["https://api.yelp.com"]

    def start_requests(self):
        params = {
            'grant_type': 'client_credentials',
            'client_id': '*******',
            'client_secret': '*******'
        }
        request = requests.post('https://api.yelp.com/oauth2/token', params=params)
        bearer_token = request.json()['access_token']
        headers = {'Authorization': 'Bearer %s' % bearer_token}
        params = {
            'term': 'restaurant',
            'offset': 20,
            'cc': 'AU',
            'location': 4806
        }
        yield scrapy.Request('https://api.yelp.com/v3/businesses/search', headers=headers, cookies=params, callback=self.parse)

    def parse(self, response):
        item = response.json()['businesses']
        return item
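Answer 0 (score: 0)
Yes, you can do this entirely with Scrapy, but rather than using a Python library as the API client, you need to make the direct requests specified in the API documentation.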
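As a rough illustration of what a "direct request" involves, the sketch below builds the search URL and the Authorization header with the standard library only. The helper name `build_search_request` and the placeholder token are hypothetical; the endpoint and the parameters are the ones from the question. Note that the question's code passes the search parameters as `cookies`, so they never reach the query string, which is one plausible cause of the 400 response.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.yelp.com/v3/businesses/search"  # endpoint from the question


def build_search_request(bearer_token, params):
    """Return (url, headers) for a GET search request.

    Hypothetical helper: the query parameters go into the URL,
    and the token goes into the Authorization header.
    """
    url = SEARCH_URL + "?" + urlencode(params)
    headers = {"Authorization": "Bearer %s" % bearer_token}
    return url, headers


url, headers = build_search_request(
    "PLACEHOLDER_TOKEN",  # assumed value; the real token comes from the token request
    {"term": "restaurant", "offset": 20, "cc": "AU", "location": 4806},
)
print(url)
print(headers)
```

In a spider, the result could be passed on as `scrapy.Request(url, headers=headers, callback=self.parse)`. As for `meta`: it is a dict that Scrapy carries from a request to its callback (readable as `response.meta`) and is not sent to the server, so a token placed only in `meta` would not authenticate the request. It can still be handy for reusing the token in later callbacks.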
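Answer 1 (score: 0)
Below is fully working code for the Yelp Fusion API using Scrapy. I have not yet implemented the URL generation logic based on postcode and offset parameters needed to show up to 1000 entries, nor have I implemented items. If you have suggestions on how to improve the code, please leave a comment.
P.S. By the way, the Fusion API has raised the limit on displayed results to 50, so you can now request up to 50 results per call.
# -*- coding: utf-8 -*-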
import json
from urllib.parse import urlencode

import scrapy


class YelpSpider(scrapy.Spider):
    name = "yelp"

    def start_requests(self):
        # As per the Yelp docs, we send the credentials as a POST request to get the access_token.
        # The response is handled in a separate callback (parse), as I do not know how to do it all in one.
        params = {
            'grant_type': 'client_credentials',
            'client_id': '**********',
            'client_secret': '************'
        }
        yield scrapy.Request(url='https://api.yelp.com/oauth2/token', method="POST", body=urlencode(params))

    def parse(self, response):
        # Retrieve the access token from the response and set the header according to the Yelp docs.
        bearer_token = json.loads(response.body)['access_token']
        headers = {'Authorization': 'Bearer %s' % bearer_token}
        # Set the search parameters.
        params = {
            'term': 'restaurant',
            'offset': 20,
            'cc': 'AU',
            'location': 4806
        }
        # Base search URL for the Fusion API.
        url = "https://api.yelp.com/v3/businesses/search"
        # Form the GET request to receive the final info as JSON. Unfortunately I did not find a more
        # appropriate way to pass params in Scrapy than appending them to the URL as shown below.
        yield scrapy.Request(url=url + '?' + urlencode(params), method="GET", headers=headers, callback=self.parse_items)

    def parse_items(self, response):
        # Parse the needed items.
        resp = json.loads(response.body)['businesses']
        print(resp)
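The URL generation by postcode and offset that this answer leaves unimplemented could look roughly like the sketch below. It assumes the page size is sent as a `limit` query parameter and that 1000 results per search is the ceiling mentioned above; the function name `search_urls` and the second postcode are made up for illustration.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.yelp.com/v3/businesses/search"  # base URL from the answer


def search_urls(locations, term="restaurant", limit=50, max_results=1000):
    """Yield one search URL per result page for each location.

    Assumption: `limit` and `offset` are the paging parameters,
    with at most `max_results` entries reachable per search.
    """
    for location in locations:
        for offset in range(0, max_results, limit):
            params = {
                "term": term,
                "location": location,
                "limit": limit,
                "offset": offset,
            }
            yield SEARCH_URL + "?" + urlencode(params)


# 4806 is the postcode from the answer; 4807 is a hypothetical second one.
urls = list(search_urls([4806, 4807]))
print(len(urls))
print(urls[0])
```

Inside the spider's `parse` method, each generated URL could then be yielded as `scrapy.Request(u, headers=headers, callback=self.parse_items)`, reusing the same Authorization header for every page.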
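The answer's `parse_items` only prints the decoded list. A minimal sketch of turning the response body into items might look like the following; the field names `id`, `name` and `rating` are assumptions about what a business entry contains, and the sample body is invented for illustration rather than real API output.

```python
import json


def parse_businesses(body):
    """Turn a search response body into a list of plain item dicts.

    Assumed fields: "id", "name", "rating". Missing keys become None.
    """
    data = json.loads(body)
    items = []
    for business in data.get("businesses", []):
        items.append({
            "id": business.get("id"),
            "name": business.get("name"),
            "rating": business.get("rating"),
        })
    return items


# Made-up sample body for illustration only; not real API output.
sample = '{"businesses": [{"id": "abc", "name": "Example Cafe", "rating": 4.5}]}'
print(parse_businesses(sample))
```

In the spider, `parse_items` could yield each of these dicts instead of printing, so that Scrapy's feed exports and item pipelines pick them up.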