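I am trying to download the table at https://www.bsgroup.com.hk/BrightSmart/MarginRatio/StockMarginRatioEnquiry.aspx?Lang=eng after selecting "Hong Kong Stocks" and clicking the "Show All" button. I checked the Network tab in Chrome's Inspect tool, and no new request for data appears to be sent to the server, so I suspect the data is already in the original page. I confirmed that it shows up in "Table1" once the "Show All" button has been pressed. I tried the following code, but nothing happens. Please advise: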
import requests
from bs4 import BeautifulSoup

url = "https://www.bsgroup.com.hk/BrightSmart/MarginRatio/StockMarginRatioEnquiry.aspx?Lang=eng"
result = requests.get(url)
result.raise_for_status()
result.encoding = "utf-8"
src = result.content
soup = BeautifulSoup(src, 'lxml')
table = soup.findAll("Table1")
output_rows = []
for table_row in table.findAll('tr'):
    columns = table_row.findAll('td')
    output_row = []
    for column in columns:
        output_row.append(column.text)
    output_rows.append(output_row)
print(output_rows)
Answer 0 (score: 1)
To get the data, you have to make a POST request with the correct parameters.
For example:
import requests
from bs4 import BeautifulSoup

url = 'https://www.bsgroup.com.hk/BrightSmart/MarginRatio/StockMarginRatioEnquiry.aspx?Lang=eng'

with requests.session() as s:
    # Load the page first and collect every named form input with its current value.
    soup = BeautifulSoup(s.get(url).text, 'html.parser')
    data = {i['name']: i['value'] if 'value' in i.attrs else '' for i in soup.select('input[name]')}
    # Leave out the "Find" button and select the Hong Kong exchange.
    del data['StockMarginRatioGrid$btnFind']
    data['StockMarginRatioGrid$txtExchange'] = 'HKEX'
    # Post the form back to the same URL and print the result grid row by row.
    soup = BeautifulSoup(s.post(url, data=data).text, 'html.parser')
    for tr in soup.select('#StockMarginRatioGrid_gridResult tr'):
        print(''.join('{:^21}'.format(td.text) for td in tr.select('td')))
Prints (column spacing adjusted):
Stock Code  Name            Stock Margin Ratio  Deposit Ratio  Stock Code  Name            Stock Margin Ratio  Deposit Ratio
1           CKHHOLDINGS     85%                 15%            2           CLPHOLDINGS     85%                 15%
3           HK&CHINAGAS     85%                 15%            4           WHARFHOLDINGS   82%                 18%
5           HSBCHOLDINGS    85%                 15%            6           POWERASSETS     85%                 15%
8           PCCW            75%                 25%            10          HANGLUNGGROUP   75%                 25%
11          HANGSENGBANK    85%                 15%            12          HENDERSONLAND   85%                 15%
14          HYSANDEV        75%                 25%            16          SHKPPT          85%                 15%
17          NEWWORLDDEV     85%                 15%            18          ORIENTALPRESS   20%                 80%
19          SWIREPACIFICA   85%                 15%            20          WHEELOCK        82%                 18%
23          BANKOFEASIA     75%                 25%            25          CHEVALIERINT'L  40%                 60%
... and so on.
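To see what the form-field step does without touching the live site, here is a minimal offline sketch. The HTML below is a made-up, heavily simplified stand-in for the real page: `__VIEWSTATE` and `__EVENTVALIDATION` are the hidden fields an ASP.NET WebForms page typically carries, their values here are dummies, and `collect_form_fields` is a hypothetical helper name. Only `StockMarginRatioGrid$btnFind` and `StockMarginRatioGrid$txtExchange` come from the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the real page; all values are dummies.
SAMPLE_HTML = """
<form method="post">
  <input type="hidden" name="__VIEWSTATE" value="dummy-state">
  <input type="hidden" name="__EVENTVALIDATION" value="dummy-validation">
  <input type="text" name="StockMarginRatioGrid$txtExchange">
  <input type="submit" name="StockMarginRatioGrid$btnFind" value="Find">
</form>
"""


def collect_form_fields(html):
    """Map every named <input> to its value ('' when it has no value attribute)."""
    soup = BeautifulSoup(html, 'html.parser')
    return {i['name']: i.get('value', '') for i in soup.select('input[name]')}


data = collect_form_fields(SAMPLE_HTML)
# A browser only submits the button that was actually clicked, so the
# "Find" button is removed from the payload before posting.
data.pop('StockMarginRatioGrid$btnFind', None)
data['StockMarginRatioGrid$txtExchange'] = 'HKEX'

for name, value in sorted(data.items()):
    print(name, '=', value)
```

The hidden fields are presumably why a plain GET is not enough here: the server expects them to be echoed back in the POST body together with the chosen exchange.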
EDIT: To write to a CSV file, you can use this example:
import csv
import requests
from bs4 import BeautifulSoup

url = 'https://www.bsgroup.com.hk/BrightSmart/MarginRatio/StockMarginRatioEnquiry.aspx?Lang=eng'

# newline='' lets the csv module control line endings (avoids blank rows on Windows).
with requests.session() as s, open('output.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out)
    soup = BeautifulSoup(s.get(url).text, 'html.parser')
    data = {i['name']: i['value'] if 'value' in i.attrs else '' for i in soup.select('input[name]')}
    del data['StockMarginRatioGrid$btnFind']
    data['StockMarginRatioGrid$txtExchange'] = 'HKEX'
    soup = BeautifulSoup(s.post(url, data=data).text, 'html.parser')
    # One CSV row per table row, with surrounding whitespace stripped from each cell.
    for tr in soup.select('#StockMarginRatioGrid_gridResult tr'):
        writer.writerow([td.text.strip() for td in tr.select('td')])
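Because the grid shows two stocks side by side, each CSV row written this way holds eight cells. If one stock per row is preferred, the cells can be split into groups of four before writing. The sketch below is only an illustration under that assumption: `split_pairs` is a hypothetical helper, the sample rows are copied from the printed output above, and it writes to an in-memory buffer rather than a file.

```python
import csv
import io


def split_pairs(cells, width=4):
    """Split one table row into records of `width` cells, skipping empty records."""
    records = [cells[i:i + width] for i in range(0, len(cells), width)]
    return [r for r in records if any(c.strip() for c in r)]


# Sample rows taken from the printed output; the live table has many more.
rows = [
    ['1', 'CKHHOLDINGS', '85%', '15%', '2', 'CLPHOLDINGS', '85%', '15%'],
    ['3', 'HK&CHINAGAS', '85%', '15%', '4', 'WHARFHOLDINGS', '82%', '18%'],
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(['Stock Code', 'Name', 'Stock Margin Ratio', 'Deposit Ratio'])
for cells in rows:
    writer.writerows(split_pairs(cells))

print(buffer.getvalue())
```

In the scraping loop, the same helper could be applied to `[td.text.strip() for td in tr.select('td')]`. Note that the header row would then also be split in two, so it may be simpler to skip it and write the header by hand as shown here.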