
Time: 2020-05-01 10:17:47

Tags: python-3.x pandas dataframe python-requests

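I am trying to analyze coronavirus pandemic data from Wikipedia, using requests to fetch the page and pandas to process it into a DataFrame that is useful to me. The table is parsed as a DataFrame with MultiIndex columns, which I have trimmed with droplevel at the column level. There are also some duplicated columns, for example Locations, one of which contains only NaN; I have removed that one using the duplicated() method.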




from io import StringIO

import pandas as pd
import requests

# Print full frames instead of truncating rows and columns.
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('expand_frame_repr', True)

header = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36","X-Requested-With":"XMLHttpRequest"}

url = 'https://en.wikipedia.org/wiki/Template:2019%E2%80%9320_coronavirus_pandemic_data'

r = requests.get(url, headers=header)

# Recent pandas versions expect a file-like object rather than a raw HTML string.
df = pd.read_html(StringIO(r.text))[0]
# Drop the second level of the MultiIndex column header.
df.columns = df.columns.droplevel(1)
df = df.rename(columns={'Locations[a]': 'Locations', 'Recov.[d]': 'Recovered', 'Deaths[c]': 'Deaths', 'Cases[b]': 'Cases'})
df = df[['Locations', 'Cases', 'Deaths', 'Recovered']]
# Among columns sharing a name, keep only the last occurrence.
df = df.loc[:, ~df.columns.duplicated(keep="last")]
# df['Locations'] = df['Locations'].str.replace(r"\[.*\]", "", regex=True)  # <-- strip footnote markers such as `[e]` from the `Locations` column
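
To see the droplevel and duplicated steps in isolation, here is a minimal sketch on a toy frame. The two-level header and its second-level labels ("x", "y", "z") are made up for illustration and only assume that the real table has two columns named Locations, the first of which is all NaN.

```python
import numpy as np
import pandas as pd

# Toy frame (assumed layout, not the real Wikipedia table): a two-level
# column header with a duplicated "Locations" column whose first copy is NaN.
columns = pd.MultiIndex.from_tuples(
    [("Locations", "x"), ("Locations", "y"), ("Cases", "z")]
)
toy = pd.DataFrame(
    [[np.nan, "Spain", 213435], [np.nan, "Italy", 205463]],
    columns=columns,
)

# Drop the second header level, leaving plain (and partly duplicated) names.
toy.columns = toy.columns.droplevel(1)

# keep="last" flags every earlier copy of a name as a duplicate.
mask = toy.columns.duplicated(keep="last")
print(mask)

# Keep only the columns that are not flagged.
toy = toy.loc[:, ~mask]
print(toy)
```

With keep="last" the first Locations copy is the one flagged, so under this assumed layout the all-NaN column is the one that gets dropped.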


                                             Locations                                              Cases                                             Deaths                                          Recovered
0                                     United States[e]                                            1095645                                              63746                                             132544
1                                                Spain                                             213435                                              24543                                             112050
2                                                Italy                                             205463                                              27967                                              75945
3                                    United Kingdom[f]                                             171253                                              26771                                                  —
4                                           Germany[g]                                             163009                                               6623                                             117734
5                                            France[h]                                             129581                                              24376                                              49476
6                                               Turkey                                             120204                                               3174                                              48886
7                                            Russia[i]                                             114431                                               1169                                              13220
8                                                 Iran                                              94640                                               6028                                              75103
9                                               Brazil                                              87187                                               6006                                              35935
10                                            China[j]                                              82874                                               4633                                              77642
11                                              Canada                                              53236                                               3184                                              21423
12                                          Belgium[k]                                              49032                                               7703                                              11892
13                                      Netherlands[l]                                              39316                                               4795                                                  —
14                                                Peru                                              36976                                               1051                                              10405
15                                               India                                              35043                                               1147                                               8889
16                                         Switzerland                                              29586                                               1423                                              23400
17                                            Portugal                                              25045                                                989                                               1519
18                                             Ecuador                                              24934                                                900                                               1806
19                                        Saudi Arabia                                              22753                                                162                                               3163
20                                              Sweden                                              21092                                               2586                                               1005
21                                             Ireland                                              20612                                               1232                                              13386
22                                              Mexico                                              19224                                               1859                                              11423
23                                           Singapore                                              17101                                                 15                                               1188
24                                            Pakistan                                              16817                                                385                                               4315
25                                            Chile[m]                                              16023                                                227                                               8580
26                                           Israel[n]                                              16004                                                223                                               8758
27                                             Austria                                              15450                                                584                                              12907
28                                            Japan[o]                                              14516                                                466                                               3466
29                                             Belarus                                              14027                                                 89                                               2386
30                                               Qatar                                              13409                                                 10                                               1372
31                                              Poland                                              12877                                                644                                               3236
32                                United Arab Emirates                                              12481                                                105                                               2429
33                                             Romania                                              12240                                                695                                               4017
34                                          Ukraine[p]                                              10861                                                272                                               1413
35                                         South Korea                                              10774                                                248                                               9072
36                                           Indonesia                                              10551                                                800                                               1591
37                                          Denmark[q]                                               9158                                                452                                               6546
38                                           Serbia[r]                                               9009                                                179                                               1343
39                                         Philippines                                               8772                                                579                                               1084
40                                          Bangladesh                                               8238                                                170                                                174
41                                           Norway[s]                                               7759                                                210                                                  —
42                                      Czech Republic                                               7689                                                237                                               3314
43                                  Dominican Republic                                               6972                                                301                                               1301
44                                        Australia[t]                                               6762                                                 92                                               5720
45                                              Panama                                               6532                                                188                                                576
46                                            Colombia                                               6211                                                278                                               1411
47                                            Malaysia                                               6071                                                103                                               4210
48                                        South Africa                                               5647                                                103                                               2073
49                                            Egypt[u]                                               5537                                                392                                               1381
50                                          Finland[v]                                               4995                                                211                                               3000
51                                          Morocco[w]                                               4423                                                170                                                984
52                                        Argentina[x]                                               4415                                                218                                               1299
53                                              Kuwait                                               4024                                                 26                                               1539
54                                             Algeria                                               4006                                                450                                               1779
55                                          Moldova[y]                                               3897                                                119                                               1182
56                                          Luxembourg                                               3784                                                 90                                               3213
57                                          Kazakhstan                                               3551                                                 25                                                879
58                                             Bahrain                                               3040                                                  8                                               1500
59                                            Thailand                                               2960                                                 54                                               2719
60                                             Hungary                                               2863                                                323                                                609
61                                              Greece                                               2591                                                140                                               1374
62                                                Oman                                               2348                                                 11                                                495
63                                         Afghanistan                                               2335                                                 68                                                310
64                                             Armenia                                               2148                                                 33                                                977
65                                                Iraq                                               2085                                                 93                                               1375
66                                             Croatia                                               2076                                                 69                                               1348
67                                               Ghana                                               2076                                                 17                                                212
68                                          Uzbekistan                                               2046                                                  9                                               1134
69                                             Nigeria                                               1932                                                 58                                                319
70                                            Cameroon                                               1832                                                 61                                                934
71                                       Azerbaijan[z]                                               1804                                                 24                                               1325
72                                             Iceland                                               1797                                                 10                                               1670
73                                Bosnia & Herzegovina                                               1757                                                 69                                                727
74                                             Estonia                                               1694                                                 52                                                249
75                                            Bulgaria                                               1541                                                 66                                                276
76                                         Puerto Rico                                               1539                                                 92                                                  —
77                                            Cuba[aa]                                               1501                                                 61                                                681
78                                              Guinea                                               1495                                                  7                                                329
79                                     North Macedonia                                               1465                                                 77                                                738
80                                            Slovenia                                               1429                                                 91                                                233
81                                            Slovakia                                               1403                                                 23                                                558
82                                           Lithuania                                               1399                                                 45                                                594
83                                         Ivory Coast                                               1275                                                 14                                                574
84                                             Bolivia                                               1167                                                 62                                                132
85                                     New Zealand[ab]                                               1132                                                 19                                               1252
86                          USS Theodore Roosevelt[ac]                                               1102                                                  1                                                 53
87                                            Djibouti                                               1089                                                  2                                                642
88                               Charles de Gaulle[ad]                                               1081                                                  0                                                  0
89                                           Hong Kong                                               1038                                                  4                                                846
90                                             Tunisia                                                994                                                 41                                                305
91                                             Senegal                                                933                                                  9                                                334
92                                              Latvia                                                870                                                 16                                                348
93                                          Cyprus[ae]                                                850                                                 15                                                148
94                                            Honduras                                                804                                                 75                                                112
---------------------------- Stripped output ------------------
226                                            Comoros                                                  1                                                  0                                                  0
227                            Saint Pierre & Miquelon                                                  1                                                  0                                                  0
228  As of 1 May 2020 (UTC) · History of cases: Chi...  As of 1 May 2020 (UTC) · History of cases: Chi...  As of 1 May 2020 (UTC) · History of cases: Chi...  As of 1 May 2020 (UTC) · History of cases: Chi...
229  Notes ^ Countries, territories, and internatio...  Notes ^ Countries, territories, and internatio...  Notes ^ Countries, territories, and internatio...  Notes ^ Countries, territories, and internatio...
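
The output above still has three rough edges: footnote markers such as [e] in Locations, a dash placeholder where Recovered is missing, and two footer rows (228 and 229) that repeat the same text in every column. Below is a minimal sketch of one possible cleanup, not a definitive solution. The helper name clean_table is hypothetical, and the sample frame is hand-made from a few rows of the output, with a shortened stand-in for the footer text.

```python
import pandas as pd


def clean_table(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: tidy a frame shaped like the output above."""
    out = df.copy()
    # Strip footnote markers such as "[e]" or "[aa]" from the location names.
    out["Locations"] = (
        out["Locations"].str.replace(r"\[[a-z]+\]", "", regex=True).str.strip()
    )
    # Convert the count columns to numbers; the dash placeholder and the
    # footer text cannot be parsed and become NaN.
    for col in ["Cases", "Deaths", "Recovered"]:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Footer rows have no numeric case count, so drop rows where Cases is NaN.
    return out.dropna(subset=["Cases"]).reset_index(drop=True)


# Hand-made sample; the last row is a shortened stand-in for a footer row.
sample = pd.DataFrame(
    {
        "Locations": ["United States[e]", "Spain", "United Kingdom[f]", "Notes ^ Countries"],
        "Cases": ["1095645", "213435", "171253", "Notes ^ Countries"],
        "Deaths": ["63746", "24543", "26771", "Notes ^ Countries"],
        "Recovered": ["132544", "112050", "—", "Notes ^ Countries"],
    }
)

cleaned = clean_table(sample)
print(cleaned)
```

Coercing with errors="coerce" turns the count columns into floats, because NaN has no integer representation in a plain int column. If integer counts matter, a nullable dtype such as "Int64" could be applied afterwards.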


0 answers:
