I am trying to get the ETF tickers from a table that spans more than 46 pages.
My code is:
import bs4 as bs
import pickle
import requests

def save_ETF_tickers():
    resp = requests.get('http://etfdb.com/type/region/north-america/us/#etfs&sort_name=assets_under_management&sort_order=desc&page=1')
    soup = bs.BeautifulSoup(resp.text, "lxml")
    table = soup.find('table', {'class': 'table mm-mobile-table table-module2 table-default table-striped table-hover table-pagination'})
    tickers = []
    for row in table.find_all('tr')[1:26]:  # skip the header row; take the 25 data rows
        ticker = row.find_all('td')[0].text  # first cell holds the ticker
        tickers.append(ticker)
    with open("ETFtickers.pickle", "wb") as f:
        pickle.dump(tickers, f)
    print(tickers)
    return tickers

save_ETF_tickers()
I know this only checks "page=1", but I cannot figure out how to retrieve the data from all 46 pages.
Any help would be much appreciated.
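One caveat before looping over pages: everything after the `#` in that URL is a URL fragment, which the browser never sends to the server, so simply changing `page=1` to `page=2` inside the fragment may return the same HTML every time (the site likely swaps pages in with JavaScript). Assuming the server does honor a real `page` query parameter (an assumption worth checking in the browser's network tab), a sketch of the multi-page loop could look like this; `page_url`, `BASE_URL`, and the `last_page` parameter are all hypothetical names introduced here:

```python
import pickle

import bs4 as bs
import requests

# Hypothetical base URL; assumes the server accepts ?page=N as a real query
# parameter rather than the original "#..." fragment (fragments are never sent).
BASE_URL = 'http://etfdb.com/type/region/north-america/us/'

def page_url(page):
    """Build the URL for a given 1-based results page."""
    return '{base}?sort_name=assets_under_management&sort_order=desc&page={page}'.format(
        base=BASE_URL, page=page)

def save_ETF_tickers(last_page=46):
    """Collect tickers from every page, then pickle the combined list."""
    tickers = []
    for page in range(1, last_page + 1):
        resp = requests.get(page_url(page))
        soup = bs.BeautifulSoup(resp.text, 'lxml')
        table = soup.find('table', {'class': 'table mm-mobile-table table-module2 table-default table-striped table-hover table-pagination'})
        if table is None:  # stop early if a page has no results table
            break
        for row in table.find_all('tr')[1:]:  # [1:] skips the header row
            tickers.append(row.find_all('td')[0].text)
    with open('ETFtickers.pickle', 'wb') as f:
        pickle.dump(tickers, f)
    return tickers
```

If the query-parameter assumption turns out to be wrong, the fallback is to open the page with the browser's developer tools, watch which request fires when you click "next page", and call that endpoint directly.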
Answer 0 (score: 0)
You can use the etfdb-api Node.js package: https://www.npmjs.com/package/etfdb-api
It gives you the ETF listings as JSON. Here is a sample JSON response:
[
  {
    "symbol": {
      "type": "link",
      "text": "VIXM",
      "url": "/etf/VIXM/"
    },
    "name": {
      "type": "link",
      "text": "ProShares VIX Mid-Term Futures ETF",
      "url": "/etf/VIXM/"
    },
    "mobile_title": "VIXM - ProShares VIX Mid-Term Futures ETF",
    "price": "$26.47",
    "assets": "$48.21",
    "average_volume": "69,873",
    "ytd": "25.15%",
    "overall_rating": {
      "type": "restricted",
      "url": "/members/join/"
    },
    "asset_class": "Volatility"
  },
  {
    "symbol": {
      "type": "link",
      "text": "DGBP",
      "url": "/etf/DGBP/"
    },
    "name": {
      "type": "link",
      "text": "VelocityShares Daily 4x Long USD vs GBP ETN",
      "url": "/etf/DGBP/"
    },
    "mobile_title": "DGBP - VelocityShares Daily 4x Long USD vs GBP ETN",
    "price": "$30.62",
    "assets": "$4.85",
    "average_volume": "1,038",
    "ytd": "25.13%",
    "overall_rating": {
      "type": "restricted",
      "url": "/members/join/"
    },
    "asset_class": "Currency"
  }
]
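Back in Python: once you have records shaped like the JSON above (however you end up fetching them), pulling out the ticker symbols is a one-liner. A small sketch, where `extract_symbols` is a hypothetical helper introduced here:

```python
def extract_symbols(records):
    """Pull the ticker text out of etfdb-style records like the sample above."""
    return [rec['symbol']['text'] for rec in records]

# Trimmed-down copies of the two sample records shown above.
records = [
    {'symbol': {'type': 'link', 'text': 'VIXM', 'url': '/etf/VIXM/'},
     'asset_class': 'Volatility'},
    {'symbol': {'type': 'link', 'text': 'DGBP', 'url': '/etf/DGBP/'},
     'asset_class': 'Currency'},
]
print(extract_symbols(records))  # -> ['VIXM', 'DGBP']
```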
Disclaimer: I am the author. :)