Handling a "load more" button with Scrapy

Time: 2016-09-11 04:14:39

Tags: python scrapy

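I am scraping the ratemyprofessor website, which has a "load more" button that loads additional professors. I used the browser debugger to inspect the network traffic, and clicking the button triggers a JS request.

The request URL contains start and rows parameters, so my idea is to increase start by 20 on each request. Would that work?

Someone told me I could try using formdata, but in this case there is no form data and the request is not a POST. Am I right?

I am new to Scrapy and Python and hope you can give me some insight. I would really appreciate it.

I am not allowed to upload images, so here are the request details as text: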

Request URL:https://search-a.akamaihd.net/typeahead/suggest/?solrformat=true&rows=10&callback=noCB&q=*%3A*+AND+schoolid_s%3A1273&defType=edismax&qf=teacherfullname_t%5E1000+autosuggest&bf=pow(total_number_of_ratings_i%2C2.1)&sort=total_number_of_ratings_i+desc&siteName=rmp&rows=20&start=20&fl=pk_id+teacherfirstname_t+teacherlastname_t+total_number_of_ratings_i+averageratingscore_rf+schoolid_s
Request Method:GET
Status Code:200 OK
Remote Address:23.212.53.206:443

QUERY STRING PARAMETERS

solrformat:true
rows:10
callback:noCB
q:*:* AND schoolid_s:1273
defType:edismax
qf:teacherfullname_t^1000 autosuggest
bf:pow(total_number_of_ratings_i,2.1)
sort:total_number_of_ratings_i desc
siteName:rmp
rows:20
start:20
fl:pk_id teacherfirstname_t teacherlastname_t total_number_of_ratings_i averageratingscore_rf schoolid_s

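1 answer:

Answer 0 (score: 0)

Yes, you are right: no formdata is needed, and the endpoint can be called directly with a GET request.

Using the parameters below, you can parse the JSON response like this: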

import json
# Parse the JSON body and pull out the list of professor records.
data = json.loads(response.body)
records = data['response']['docs']
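
As a minimal sketch of the parsing step outside of a spider, the helper below takes a raw response body and returns the record list. The function name extract_docs and the sample payload are made up for illustration; only the response/docs nesting comes from the snippet above, and the numFound and start keys are assumed from the usual Solr JSON layout.

```python
import json


def extract_docs(body):
    """Return the list of records from a raw JSON response body."""
    # Scrapy exposes response.body as bytes, so decode before parsing.
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    data = json.loads(body)
    return data["response"]["docs"]


# Made-up payload that only mimics the expected nesting.
sample_body = (
    b'{"response": {"numFound": 2, "start": 0, "docs": ['
    b'{"pk_id": 1, "teacherlastname_t": "Smith"}, '
    b'{"pk_id": 2, "teacherlastname_t": "Jones"}]}}'
)

docs = extract_docs(sample_body)
for doc in docs:
    print(doc["pk_id"], doc["teacherlastname_t"])
```

Inside a Scrapy callback the same function could be called as extract_docs(response.body).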

Parameters to use: start, rows

Parameters to avoid: callback
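
The reason to avoid callback is that such a parameter usually asks the server for JSONP, meaning the JSON is wrapped in a function call and json.loads rejects it. If the parameter cannot be dropped, a tolerant parser along these lines may help. This is only a sketch: the helper name loads_maybe_jsonp is hypothetical, and the wrapper shape noCB(...) is assumed from the callback value in the question, not confirmed against the live endpoint.

```python
import json
import re

# Matches "name( ... )" with an optional trailing semicolon.
_JSONP = re.compile(r"^\s*[\w.$]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def loads_maybe_jsonp(text):
    """Parse plain JSON, or JSON wrapped in a JSONP callback."""
    match = _JSONP.match(text)
    if match:
        # Keep only what sits between the outer parentheses.
        text = match.group(1)
    return json.loads(text)


wrapped = 'noCB({"response": {"docs": []}});'
print(loads_maybe_jsonp(wrapped))
```

Dropping the callback parameter is still the simpler option, since the body can then go straight to json.loads.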

Parameter details:

solrformat:true
#callback:noCB # Drop this parameter so that the response comes back as plain JSON
q:*:* AND schoolid_s:1273
defType:edismax
qf:teacherfullname_t^1000 autosuggest
bf:pow(total_number_of_ratings_i,2.1)
sort:total_number_of_ratings_i desc
siteName:rmp
rows:20
# rows: the number of records returned in a single response (20 here).
# You can try a larger value, up to whatever maximum the server allows (e.g. 500 or 1000), to reduce the number of requests made to the website.
start:20
# start: the offset of the first record to return. Begin at 0 and increase it by the value of rows on each request (0, 20, 40, ...) until you have covered the total number of records.
fl:pk_id teacherfirstname_t teacherlastname_t total_number_of_ratings_i averageratingscore_rf schoolid_s
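
Putting the parameters together, the paging URLs could be generated roughly as follows. This is a sketch under assumptions: build_url and page_starts are hypothetical helper names, the total record count has to come from somewhere (for example a numFound field, if the response has one), and the endpoint is taken from the question as-is and may have changed since.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://search-a.akamaihd.net/typeahead/suggest/"


def build_url(start, rows=20, school_id=1273):
    """Build the GET URL for one page; callback is deliberately left out."""
    params = [
        ("solrformat", "true"),
        ("q", "*:* AND schoolid_s:%d" % school_id),
        ("defType", "edismax"),
        ("qf", "teacherfullname_t^1000 autosuggest"),
        ("bf", "pow(total_number_of_ratings_i,2.1)"),
        ("sort", "total_number_of_ratings_i desc"),
        ("siteName", "rmp"),
        ("rows", rows),
        ("start", start),
        ("fl", "pk_id teacherfirstname_t teacherlastname_t "
               "total_number_of_ratings_i averageratingscore_rf schoolid_s"),
    ]
    return BASE_URL + "?" + urlencode(params)


def page_starts(total, rows=20):
    """Return the start offsets needed to cover `total` records."""
    return list(range(0, total, rows))


# Inspect the query string of the third page (start=40).
query = parse_qs(urlsplit(build_url(40)).query)
print(query["start"], query["rows"])

# Offsets for a hypothetical total of 45 records.
print(page_starts(45))
```

In a spider, each URL could be passed to scrapy.Request(url, callback=self.parse). If the total is not known in advance, another option is to keep increasing start by rows until a response comes back with an empty docs list.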