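I'm a Python beginner and I don't know much about fetching data from the internet. The approach I use here worked for fetching and printing the IMDB Top 250 movies, so I wanted to do the same with this coronavirus data. Unlike with the IMDB data, however, the program does not treat the items as a list, even though I can't see much difference between the two pages. How can I at least print the country names using plain requests and BeautifulSoup?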
import requests
from bs4 import BeautifulSoup

url = requests.get("https://www.worldometers.info/coronavirus/")
soup = BeautifulSoup(url.content, "html.parser")
new_soup = soup.find_all("table", {"id":"main_table_countries_today"})
country_table = new_soup[0].contents[3]
country_table = country_table.find_all("tr")
for country in country_table:
    country_name = country.find_all("td", {"style":"font-weight: bold; font-size:15px; text-align:left;"})
    print(country_name[0].text)
Answer 0 (score: 1)
I have been getting the data from the Johns Hopkins University GitHub repository, which is considered a reputable source:
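names = ('confirmed', 'deaths', 'recovered')
src_base = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_{name}_global.csv'
# Build one URL per data set; the loops below iterate over this dictionary.
src = {name: src_base.format(name=name) for name in names}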
The files can be fetched with requests:
import requests

for name, url in src.items():
    response = requests.get(url)
and conveniently converted to Pandas data frames:
import io
import pandas as pd

dfs = {}
for name, url in src.items():
    response = requests.get(url)
    # Parse the downloaded bytes as CSV.
    dfs[name] = pd.read_csv(io.BytesIO(response.content))
    print(name, url)
    print(dfs[name])
confirmed https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv
                Province/State         Country/Region  ...  4/13/20  4/14/20
0                          NaN            Afghanistan  ...      665      714
1                          NaN                Albania  ...      467      475
2                          NaN                Algeria  ...     1983     2070
3                          NaN                Andorra  ...      646      659
4                          NaN                 Angola  ...       19       19
..                         ...                    ...  ...      ...      ...
259  Saint Pierre and Miquelon                 France  ...        1        1
260                        NaN            South Sudan  ...        4        4
261                        NaN         Western Sahara  ...        6        6
262                        NaN  Sao Tome and Principe  ...        4        4
263                        NaN                  Yemen  ...        1        1

[264 rows x 88 columns]
deaths https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv
                Province/State         Country/Region  ...  4/13/20  4/14/20
0                          NaN            Afghanistan  ...       21       23
1                          NaN                Albania  ...       23       24
2                          NaN                Algeria  ...      313      326
3                          NaN                Andorra  ...       29       31
4                          NaN                 Angola  ...        2        2
..                         ...                    ...  ...      ...      ...
259  Saint Pierre and Miquelon                 France  ...        0        0
260                        NaN            South Sudan  ...        0        0
261                        NaN         Western Sahara  ...        0        0
262                        NaN  Sao Tome and Principe  ...        0        0
263                        NaN                  Yemen  ...        0        0

[264 rows x 88 columns]
recovered https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv
                Province/State         Country/Region  ...  4/13/20  4/14/20
0                          NaN            Afghanistan  ...       32       40
1                          NaN                Albania  ...      232      248
2                          NaN                Algeria  ...      601      691
3                          NaN                Andorra  ...      128      128
4                          NaN                 Angola  ...        4        5
..                         ...                    ...  ...      ...      ...
245  Saint Pierre and Miquelon                 France  ...        0        0
246                        NaN            South Sudan  ...        0        0
247                        NaN         Western Sahara  ...        0        0
248                        NaN  Sao Tome and Principe  ...        0        0
249                        NaN                  Yemen  ...        0        0

[250 rows x 88 columns]
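Since the question asks for the country names, and some countries appear on several rows (one per province, as with France above), the rows may need to be combined per country first. The following is a minimal sketch of that step, not part of the original code. It runs on a small made-up sample rather than the downloaded files, the helper name `totals_by_country` is hypothetical, and it assumes the first four columns (`Province/State`, `Country/Region`, `Lat`, `Long`) are metadata and every remaining column is a date.

```python
import io

import pandas as pd

# Hypothetical three-row sample in the same layout as the CSSE files;
# the numbers are made up for illustration only.
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,4/13/20,4/14/20
,Afghanistan,33.0,65.0,10,12
French Guiana,France,3.9,-53.1,1,2
,France,46.2,2.2,100,110
"""


def totals_by_country(df):
    """Sum the date columns over all rows that share a country."""
    # Assumption: the first four columns are metadata, the rest are dates.
    date_columns = list(df.columns[4:])
    return df.groupby("Country/Region")[date_columns].sum()


sample = pd.read_csv(io.StringIO(SAMPLE_CSV))
totals = totals_by_country(sample)

# The index holds one entry per country, sorted alphabetically by groupby.
for country in totals.index:
    print(country)
```

With the real data, the equivalent call would be `totals_by_country(dfs['confirmed'])`, provided the column layout matches the assumption above.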
In the end you may want some quick plots:
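A minimal sketch of how such a quick plot might be produced from data frames in this layout. The helper names `country_series` and `plot_country` are hypothetical, the sample numbers are made up, and the code again assumes that the first four columns are metadata and the rest are dates in `month/day/year` form. It is one possible approach, not necessarily the one used in the full code.

```python
import pandas as pd


def country_series(df, country):
    """Return one country's totals as a Series indexed by date."""
    rows = df[df["Country/Region"] == country]
    # Assumption: the first four columns are metadata, the rest are dates.
    totals = rows.iloc[:, 4:].sum()
    totals.index = pd.to_datetime(totals.index, format="%m/%d/%y")
    return totals


def plot_country(dfs, country):
    """Draw one line per data set (e.g. confirmed, deaths, recovered)."""
    # Imported here so that country_series works without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name, df in dfs.items():
        country_series(df, country).plot(ax=ax, label=name)
    ax.set_title(country)
    ax.legend()
    return ax


# Made-up sample in the assumed layout, used only to show the reshaping step.
sample = pd.DataFrame(
    {
        "Province/State": [None, "French Guiana"],
        "Country/Region": ["France", "France"],
        "Lat": [46.2, 3.9],
        "Long": [2.2, -53.1],
        "4/13/20": [100, 1],
        "4/14/20": [110, 2],
    }
)
series = country_series(sample, "France")
print(series)
```

With the `dfs` dictionary built earlier, calling `plot_country(dfs, "Italy")` followed by `matplotlib.pyplot.show()` should display one line per data set.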
The full code is available here.
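As for the scraping code in the question: one likely reason it fails is that it indexes `country_name[0]` for every row, which raises an `IndexError` as soon as a row has no cell with that exact style string (a header row, for example). The sketch below shows a more defensive loop. It runs on a hypothetical, heavily simplified stand-in for the page's markup, so the real table may need a different cell index, and the helper name `country_names` is made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical, heavily simplified stand-in for the page's markup.
SAMPLE_HTML = """
<table id="main_table_countries_today">
  <thead><tr><th>Country</th><th>Total Cases</th></tr></thead>
  <tbody>
    <tr><td>World</td><td>1,000</td></tr>
    <tr><td><a href="#">Italy</a></td><td>600</td></tr>
    <tr><td><a href="#">Spain</a></td><td>400</td></tr>
  </tbody>
</table>
"""


def country_names(html):
    """Return the text of the first <td> cell of every table row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="main_table_countries_today")
    names = []
    for row in table.find_all("tr"):
        cells = row.find_all("td")
        if not cells:  # header rows contain <th> cells, not <td>
            continue
        names.append(cells[0].get_text(strip=True))
    return names


print(country_names(SAMPLE_HTML))
```

Selecting cells by position rather than by an inline style string avoids depending on styling details, and searching the whole table for `tr` elements avoids relying on `.contents[3]`, whose meaning changes with the whitespace in the markup.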