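I am trying to parse the web page below to get the names of the stocks that are at an all-time high or low on the exchange.
https://www.bseindia.com/markets/equity/EQReports/HighLow.html?Flag=H#
However, when I download the page with Beautiful Soup and inspect the data, only half of the stocks show up. The listing is split across 2 pages, with 25 stocks on one and 25 on the other, and this approach only lets me parse the first page. Clicking through to the second page leaves the URL unchanged. How can I solve this?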
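Answer 0 (score: 1)
The site has an API endpoint that returns the data in a nice JSON format. You can fetch the JSON response and then normalize it to create a table. The response contains 2 tables, and I'm not sure whether you want the second one. I store them separately and then append them to put them together; if you don't need the second table, skip that last step.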
import pandas as pd
import requests

# API endpoint behind the High/Low page; the query string is supplied via `params`
url = 'https://api.bseindia.com/BseIndiaAPI/api/MktHighLowData/w'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36'}
payload = {
    'Grpcode': '',
    'HLflag': 'H',
    'indexcode': '',
    'scripcode': '',
}
jsonObj = requests.get(url, headers=headers, params=payload).json()

# The response holds two tables; flatten each one into a DataFrame
df_table = pd.json_normalize(jsonObj['Table'])
df_table1 = pd.json_normalize(jsonObj['Table1'])

# DataFrame.append was removed in pandas 2.0; pd.concat is its replacement
df = pd.concat([df_table, df_table1])
Output:
print(df)
ALLTimeHigh ... dt_tm
0 1019.95 ... 2019-02-25T16:00:03
1 263.00 ... 2019-02-25T16:00:03
2 24.00 ... 2019-02-25T16:00:03
3 35.90 ... 2019-02-25T16:00:03
4 29.75 ... 2019-02-25T16:00:03
5 43.00 ... 2019-02-25T16:00:03
6 140.40 ... 2019-02-25T16:00:03
7 15.39 ... 2019-02-25T16:00:03
8 724.00 ... 2019-02-25T16:00:03
9 1495.00 ... 2019-02-25T16:00:03
10 123.15 ... 2019-02-25T16:00:03
11 121.00 ... 2019-02-25T16:00:03
12 238.50 ... 2019-02-25T16:00:03
13 89.00 ... 2019-02-25T16:00:03
14 819.95 ... 2019-02-25T16:00:03
15 112.40 ... 2019-02-25T16:00:03
16 49.95 ... 2019-02-25T16:00:03
17 330.85 ... 2019-02-25T16:00:03
18 167.45 ... 2019-02-25T16:00:03
19 25.10 ... 2019-02-25T16:00:03
20 940.00 ... 2019-02-25T16:00:03
21 165.00 ... 2019-02-25T16:00:03
22 NaN ... 2019-02-25T16:00:03
23 239.00 ... 2019-02-25T16:00:03
24 151.55 ... 2019-02-25T16:00:03
25 34.35 ... 2019-02-25T16:00:03
26 256.15 ... 2019-02-25T16:00:03
27 49.75 ... 2019-02-25T16:00:03
28 103.25 ... 2019-02-25T16:00:03
29 50.50 ... 2019-02-25T16:00:03
.. ... ... ...
87 135.00 ... 2019-02-25T16:00:03
88 219.80 ... 2019-02-25T16:00:03
89 58.00 ... 2019-02-25T16:00:03
90 494.00 ... 2019-02-25T16:00:03
91 285.30 ... 2019-02-25T16:00:03
92 55.65 ... 2019-02-25T16:00:03
93 4.45 ... 2019-02-25T16:00:03
94 50.00 ... 2019-02-25T16:00:03
95 50.00 ... 2019-02-25T16:00:03
96 92.50 ... 2019-02-25T16:00:03
97 154.80 ... 2019-02-25T16:00:03
98 82.40 ... 2019-02-25T16:00:03
99 293.85 ... 2019-02-25T16:00:03
100 396.00 ... 2019-02-25T16:00:03
101 98.00 ... 2019-02-25T16:00:03
102 144.60 ... 2019-02-25T16:00:03
103 11.50 ... 2019-02-25T16:00:03
104 42.95 ... 2019-02-25T16:00:03
105 313.00 ... 2019-02-25T16:00:03
106 1120.00 ... 2019-02-25T16:00:03
107 87.00 ... 2019-02-25T16:00:03
108 82.00 ... 2019-02-25T16:00:03
109 214.00 ... 2019-02-25T16:00:03
110 505.00 ... 2019-02-25T16:00:03
111 1525.00 ... 2019-02-25T16:00:03
112 220.00 ... 2019-02-25T16:00:03
113 36.00 ... 2019-02-25T16:00:03
114 170.00 ... 2019-02-25T16:00:03
115 549.50 ... 2019-02-25T16:00:03
116 4990.00 ... 2019-02-25T16:00:03
[168 rows x 19 columns]
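The printed index above stops at 116 even though the frame has 168 rows, which suggests each table kept its own row numbering when the two were combined. If a single continuous index is preferred, the combining step could be sketched roughly as follows. The helper name `combine_tables` and the sample payload are made up for illustration; only the `Table` and `Table1` keys and the `ALLTimeHigh` and `dt_tm` columns come from the answer above, and the real response has more columns than shown here.

```python
import pandas as pd


def combine_tables(json_obj, keys=("Table", "Table1")):
    """Flatten each listed table that is present and stack them into one DataFrame."""
    # Hypothetical helper: skips keys that are missing or hold an empty list
    frames = [pd.json_normalize(json_obj[key]) for key in keys if json_obj.get(key)]
    if not frames:
        return pd.DataFrame()
    # ignore_index=True renumbers the rows 0..n-1 instead of repeating each table's index
    return pd.concat(frames, ignore_index=True)


# Made-up sample shaped like the API response; the values are illustrative only
sample = {
    "Table": [
        {"ALLTimeHigh": 1019.95, "dt_tm": "2019-02-25T16:00:03"},
        {"ALLTimeHigh": 263.00, "dt_tm": "2019-02-25T16:00:03"},
    ],
    "Table1": [
        {"ALLTimeHigh": 34.35, "dt_tm": "2019-02-25T16:00:03"},
    ],
}

df = combine_tables(sample)
print(df)
```

With the real response you would pass the parsed JSON in place of `sample`. To keep only the first table, call `combine_tables(json_obj, keys=("Table",))`.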