I want to scrape the full contents of the main coin table at the URL below.
However, my code below does not seem to work:
import requests
import pandas as pd
url = 'https://messari.io/screener/coinbase-ventures-portfolio-34D634C4'
html = requests.get(url).content
df_list = pd.read_html(html)
df = df_list[-1]
print(df)
Where am I going wrong?
Answer 0: (score: 2)
You can fetch the data directly from the source that the page is rendered from:
import requests
import pandas as pd
url = 'https://data.messari.io/api/v1/markets/prices-legacy'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'}
jsonData = requests.get(url, headers=headers).json()
data = pd.json_normalize(jsonData['data'])
Output:
print(data)
id ... stakingEngagedPercent
0 1e31218a-e44e-4285-820c-8282ee222035 ... NaN
1 21c795f5-1bfd-40c3-858e-e9d7e820c6d0 ... NaN
2 7dc551ba-cfed-4437-a027-386044415e3e ... NaN
3 97775be0-2608-4720-b7af-f85b24c7eb2d ... NaN
4 51f8ea5e-f426-4f40-939a-db7e05495374 ... NaN
... ... ...
1609 ff4f6990-5333-4e75-81cb-1342af9cc0a1 ... NaN
1610 ffae284d-cb73-44e5-8934-cb3658284e46 ... NaN
1611 ffaebc24-053e-428e-a84d-be836e4f8a3a ... NaN
1612 ffc64018-c724-44ac-b3d0-00e33dff7615 ... NaN
1613 ffde2011-560a-458b-abaa-2b4f20f851a2 ... NaN
[1614 rows x 177 columns]
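The wide result (177 columns) comes from `pd.json_normalize` flattening nested JSON objects into dotted column names. As a rough illustration of that idea only, here is a standard-library sketch. The `flatten` helper and the `sample` record are hypothetical and are not taken from the Messari response.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical record; the real API fields may look different.
sample = {"id": "abc", "price": {"usd": 1.5, "btc": 0.00003}}
flat = flatten(sample)
print(flat)
```

Each nested key becomes its own column, which is why a modest JSON payload can expand into many columns.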
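Answer 1: (score: 0)
The page is dynamic and does not contain the table. When you download it, all you get are the scripts that will later be used to render the page in the browser.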
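One way to confirm this for any page is to count the `<table>` tags in the raw HTML before handing it to `pd.read_html`, which raises `ValueError` when it finds no tables. The following is a minimal sketch using only the standard library. `TableCounter`, `count_tables`, and the two sample strings are made up for illustration; in practice you would pass in `requests.get(url).text`.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Counts opening <table> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1


def count_tables(html):
    parser = TableCounter()
    parser.feed(html)
    return parser.tables


# Hypothetical pages: one server-rendered, one filled in by JavaScript.
static_page = "<html><body><table><tr><td>BTC</td></tr></table></body></html>"
dynamic_page = '<html><body><div id="root"></div><script>render()</script></body></html>'

print(count_tables(static_page))
print(count_tables(dynamic_page))
```

If the count is zero, the table is most likely built client-side, and the data has to come from the underlying API or from a browser-driven tool instead.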