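I am scraping a website (using Python with the requests and requests-html modules) and I need to go through every page of a list of items.
As a human user, I click "2" to go to the second page, or click "->" to move from the current page to the next one.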
When I inspect those elements, each one is a <div> tag, for example:
<div class="pagination__Page..."> 2 </div>
or
<div class="pagination__Page..."> -> </div>
Both have an event attached, so clicking one loads the corresponding page.
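I tried the for-loop pagination suggested in the requests-html documentation, but it does not work here because no link is associated with the r.html object or with any page of the list.
When I click these divs on the site, the URL does not change at all.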
Inspecting the event (for 2) shows that it calls a JS function such as:
function() {
    return a({
        pageNum: e
    })
}
Inspecting the event (for ->) shows that it calls a JS function such as:
function() {
    return a({
        direction: "right"
    })
}
I would like to get the same result as clicking, but I do not know how.
Answer 0 (score: 0)
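You will have to use the browser's dev tools to get the exact query parameters (rqid in particular), but this should get you started. It returns the complete list, with no need to go through it page by page: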
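import requests
import pandas as pd

# JSON endpoint that the page itself calls (found with the browser dev tools)
url = 'https://www.flightstats.com/v2/api-next/flight-tracker/arr/ORY/2019/4/29/6'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}
# Copy these from the request shown in dev tools, rqid in particular
query = {
    'carrierCode': '',
    'numHours': '6',
    'rqid': '7tl8o43bkps'}

response = requests.get(url, headers=headers, params=query)
response.raise_for_status()  # fail early on an HTTP error
jsonData = response.json()
# pd.json_normalize replaces the deprecated pandas.io.json.json_normalize
df = pd.json_normalize(jsonData['data']['flights'])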
Output:
print(df)
airport.city ... url
0 Cayenne ... /flight-tracker/TX/571?year=2019&month=4&date=...
1 Saint Denis de la Reunion ... /flight-tracker/AF/671?year=2019&month=4&date=...
2 Pointe-a-Pitre ... /flight-tracker/SS/3541?year=2019&month=4&date...
3 Pointe-a-Pitre ... /flight-tracker/TX/541?year=2019&month=4&date=...
4 Moscow ... /flight-tracker/S7/4021?year=2019&month=4&date...
5 Moscow ... /flight-tracker/ZI/516?year=2019&month=4&date=...
6 Cayenne ... /flight-tracker/AF/853?year=2019&month=4&date=...
7 Cayenne ... /flight-tracker/KL/2245?year=2019&month=4&date...
8 Toulouse ... /flight-tracker/AF/6101?year=2019&month=4&date...
9 Pointe-a-Pitre ... /flight-tracker/KL/2261?year=2019&month=4&date...
10 Toulouse ... /flight-tracker/HOP/5101?year=2019&month=4&dat...
11 Pointe-a-Pitre ... /flight-tracker/AF/793?year=2019&month=4&date=...
12 Beirut ... /flight-tracker/SS/6628?year=2019&month=4&date...
13 Beirut ... /flight-tracker/ZI/628?year=2019&month=4&date=...
14 Montpellier ... /flight-tracker/AF/7541?year=2019&month=4&date...
15 Geneva ... /flight-tracker/U2/1399?year=2019&month=4&date...
16 Montpellier ... /flight-tracker/HOP/5541?year=2019&month=4&dat...
17 Ajaccio ... /flight-tracker/AF/4442?year=2019&month=4&date...
18 Bastia ... /flight-tracker/HOP/7780?year=2019&month=4&dat...
19 Ajaccio ... /flight-tracker/HOP/7770?year=2019&month=4&dat...
20 Ajaccio ... /flight-tracker/XK/770?year=2019&month=4&date=...
21 Bastia ... /flight-tracker/XK/780?year=2019&month=4&date=...
22 Bastia ... /flight-tracker/AF/4458?year=2019&month=4&date...
23 Marseille ... /flight-tracker/HOP/5001?year=2019&month=4&dat...
24 Marseille ... /flight-tracker/AF/6001?year=2019&month=4&date...
25 Clermont-Ferrand ... /flight-tracker/AF/7433?year=2019&month=4&date...
26 Clermont-Ferrand ... /flight-tracker/HOP/5433?year=2019&month=4&dat...
27 Bordeaux ... /flight-tracker/HOP/5253?year=2019&month=4&dat...
28 Bordeaux ... /flight-tracker/AF/6253?year=2019&month=4&date...
29 Nice ... /flight-tracker/HOP/5203?year=2019&month=4&dat...
.. ... ... ...
192 Marseille ... /flight-tracker/HOP/5009?year=2019&month=4&dat...
193 Sevilla ... /flight-tracker/TO/3201?year=2019&month=4&date...
194 Bordeaux ... /flight-tracker/AF/6277?year=2019&month=4&date...
195 Toulouse ... /flight-tracker/U2/4026?year=2019&month=4&date...
196 Toulouse ... /flight-tracker/HOP/5117?year=2019&month=4&dat...
197 Toulouse ... /flight-tracker/AF/6117?year=2019&month=4&date...
198 Rome ... /flight-tracker/IB/5193?year=2019&month=4&date...
199 Rome ... /flight-tracker/VY/6251?year=2019&month=4&date...
200 Bordeaux ... /flight-tracker/HOP/5277?year=2019&month=4&dat...
201 Faro ... /flight-tracker/U2/4278?year=2019&month=4&date...
202 Campinas ... /flight-tracker/AD/8900?year=2019&month=4&date...
203 Casablanca ... /flight-tracker/AT/760?year=2019&month=4&date=...
204 Campinas ... /flight-tracker/ZI/36?year=2019&month=4&date=2...
205 Rome ... /flight-tracker/U2/4242?year=2019&month=4&date...
206 Ajaccio ... /flight-tracker/XK/772?year=2019&month=4&date=...
207 Ajaccio ... /flight-tracker/AF/4445?year=2019&month=4&date...
208 Ajaccio ... /flight-tracker/HOP/7772?year=2019&month=4&dat...
209 Madrid ... /flight-tracker/AV/6049?year=2019&month=4&date...
210 Madrid ... /flight-tracker/AA/8758?year=2019&month=4&date...
211 Madrid ... /flight-tracker/IB/3436?year=2019&month=4&date...
212 Setif ... /flight-tracker/AH/1108?year=2019&month=4&date...
213 Berlin ... /flight-tracker/ZI/608?year=2019&month=4&date=...
214 Berlin ... /flight-tracker/SS/6608?year=2019&month=4&date...
215 Toulon ... /flight-tracker/AF/7513?year=2019&month=4&date...
216 Toulon ... /flight-tracker/HOP/5513?year=2019&month=4&dat...
217 Perpignan ... /flight-tracker/AF/7465?year=2019&month=4&date...
218 Perpignan ... /flight-tracker/HOP/5465?year=2019&month=4&dat...
219 Rodez ... /flight-tracker/BE/7682?year=2019&month=4&date...
220 Nantes ... /flight-tracker/AF/7383?year=2019&month=4&date...
221 Nantes ... /flight-tracker/HOP/5383?year=2019&month=4&dat...
[222 rows x 13 columns]
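If the same request is needed for other airports or dates, the call above could be wrapped in two small helpers. This is only a sketch under stated assumptions: the function names build_tracker_url and fetch_flights are hypothetical, and the path layout (direction/airport/year/month/day/hour) is inferred from the single URL shown above rather than from any documented API, so it should be checked against what dev tools shows.

```python
from typing import Any, Dict, List

BASE_URL = "https://www.flightstats.com/v2/api-next/flight-tracker"


def build_tracker_url(airport: str, year: int, month: int, day: int,
                      hour: int, direction: str = "arr") -> str:
    """Assemble the endpoint URL.

    Assumption: the path segments mean direction/airport/year/month/day/hour,
    as guessed from the one example URL in the answer.
    """
    return f"{BASE_URL}/{direction}/{airport}/{year}/{month}/{day}/{hour}"


def fetch_flights(session: Any, url: str, rqid: str,
                  num_hours: int = 6, timeout: float = 10.0) -> List[Dict[str, Any]]:
    """Request the JSON endpoint and return the raw list of flight records.

    `session` is expected to behave like a requests.Session; `rqid` must be
    copied from the request seen in the browser dev tools.
    """
    query = {"carrierCode": "", "numHours": str(num_hours), "rqid": rqid}
    response = session.get(url, params=query, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.json()["data"]["flights"]
```

To use it, create a requests.Session, set the User-Agent header on it with session.headers.update(headers), and pass the result of fetch_flights(session, build_tracker_url('ORY', 2019, 4, 29, 6), rqid) to pd.json_normalize as before. Passing the session in as an argument keeps the helper easy to test with a stand-in object and reuses one connection across several requests.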