Question

我使用以下方法从网站提取数据：

import pandas as pd  
import matplotlib.pyplot as plt

stat_dict={'Disposals' : 0,
           'Kicks' : 1,
           'Marks' : 2,
           'Handballs' : 3,
           'Goals' : 4,
           'Behinds' : 5,
           'Hitouts' : 6,
           'Tackles' : 7,
           'Rebounds' : 8,
           'Inside50s' : 9,
           'Clearances': 9,
           'Clangers' : 10,
           'FreesFor' : 11,
           'FreesAgainst' : 12,
           'ContestedPosessions' : 13,
           'UncontestedPosesseions' : 14,
           'ContestedMarks' : 15,
           'MarksInside50' : 16,
           'OnePercenters' : 17,
           'Bounces' : 18,
           'GoalAssists' : 19,
           'Timeplayed' : 20}

team_lower_case='fremantle'
player1="Fyfe, Nat"
stat_required='Disposals'
rounds=8



tables = pd.read_html("https://afltables.com/afl/stats/teams/" +str(team_lower_case)+"/2018_gbg.html")

for df in tables:
    df.drop(df.columns[rounds+1:], axis=1, inplace=True)   # remove unwanted columns
    df.columns = df.columns.droplevel(0)    # remove extra index level


stat_table=tables[stat_dict[stat_required]]

player_stat=stat_table[stat_table["Player"]==player1]

产生以下内容：

      Player    R1    R2    R3    R4    R5    R6    R7    R8
8  Fyfe, Nat  22.0  29.0  38.0  25.0  43.0  27.0  33.0  36.0

如何在“＆＃39;播放器＆＃39;”

列中为数据设置索引？

如何将列标题转为行标签，反之亦然？ 我正在寻找的输出如下：

 Round  Fyfe, Nat  Neale,Lachie
    R1        22             37
    R2        29             28

我想得到的最终输出是散点图，在x轴上绘制列标题，例如R1，R2等......以及y轴上的行数据。

我觉得我应该可以直接绘制数据框，但我取得成功的唯一方法是执行以下操作：

for round, disp in player_stat.iterrows():
    player1_list=[]
    player1_list.append(disp)

plt.style.use('ggplot')
plt.scatter(range(1,rounds+1), player1_list, label=player1)
plt.legend(loc="lower right")
plt.title("Disposals per round")
plt.xlabel("Rounds")
plt.ylabel("Disposals")
plt.ylim(ymin=0)

似乎我应该使用.transpose，它接近我想要的并输出以下内容：

                8
Player  Fyfe, Nat
R1             22
R2             29
R3             38
R4             25
R5             43
R6             27
R7             33
R8             36

Answer 1

set_index使用T和How to simulate HTTP post request using Python Requests module?：

stat_table=tables[stat_dict[stat_required]].set_index('Player').T

print (stat_table)
Player  Ballantyne, Hayden  Banfield, Bailey  Blakely, Connor  \
R1                    10.0              12.0             17.0   
R2                    11.0              14.0             30.0   
R3                    18.0              11.0             21.0   
R4                     9.0              17.0             16.0   
R5                    14.0              18.0             23.0   
R6                     6.0              14.0             31.0   
R7                    14.0              20.0             21.0   
R8                    11.0              12.0             35.0   

Player  Brayshaw, Andrew  Cerra, Adam  Cox, Brennan  Crowden, Mitch  \
R1                  12.0          NaN           NaN             NaN   
R2                  16.0          9.0           NaN            11.0   
R3                   7.0         10.0           NaN            13.0   
R4                  11.0          7.0           NaN            15.0   
R5                  17.0         15.0           NaN            17.0   
R6                  14.0         11.0           NaN            11.0   
R7                  14.0         18.0           8.0             8.0   
R8                  16.0         14.0          12.0            10.0   

Player  Duman, Taylin  Fyfe, Nat  Hamling, Joel   ...    Pearce, Danyle  \
R1                NaN       22.0            NaN   ...              12.0   
R2                NaN       29.0           12.0   ...               NaN   
R3                NaN       38.0           11.0   ...               NaN   
R4                NaN       25.0           16.0   ...               NaN   
R5               15.0       43.0           11.0   ...               NaN   
R6               12.0       27.0            9.0   ...               NaN   
R7                8.0       33.0           12.0   ...               NaN   
R8               15.0       36.0           18.0   ...              18.0   

Player  Ryan, Luke  Sandilands, Aaron  Sheridan, Tom  Sutcliffe, Cameron  \
R1            16.0               16.0            NaN                17.0   
R2            22.0               10.0            NaN                 NaN   
R3            21.0               14.0           14.0                 NaN   
R4            14.0               15.0           11.0                 NaN   
R5            21.0                9.0            NaN                 NaN   
R6            18.0               10.0            NaN                 NaN   
R7            13.0               11.0            NaN                 NaN   
R8            27.0               13.0            NaN                 NaN   

Player  Taberner, Matthew  Tucker, Darcy  Walters, Michael  Wilson, Nathan  \
R1                   20.0           12.0              18.0            15.0   
R2                   16.0            NaN              26.0            23.0   
R3                   18.0            NaN              23.0            18.0   
R4                   19.0           15.0              21.0            14.0   
R5                    6.0           17.0              18.0            27.0   
R6                    NaN           17.0               2.0            15.0   
R7                    NaN           15.0               NaN            19.0   
R8                    NaN           13.0               NaN             NaN   

Player  Totals  
R1       358.0  
R2       399.0  
R3       387.0  
R4       362.0  
R5       407.0  
R6       356.0  
R7       346.0  
R8       404.0  

[8 rows x 32 columns]

然后可以按player1选择列：

player_stat=stat_table[player1]
print (player_stat)
R1    22.0
R2    29.0
R3    38.0
R4    25.0
R5    43.0
R6    27.0
R7    33.0
R8    36.0
Name: Fyfe, Nat, dtype: float64

最后plot：

plt.scatter((range(1,rounds+1)), stat_table[player1])

Answer 2

您可以尝试：

df = pd.DataFrame({'Player':['Fyfe, Nat','Neale,Lachie'],
                'r1':[22,29],
                'r2': [27, 25],
                'r3': [30, 21]})

print df
         Player  r1  r2  r3
0     Fyfe, Nat  22  27  30
1  Neale,Lachie  29  25  21

df = df.set_index('Player').T
print df

输出：

Player  Fyfe, Nat  Neale,Lachie
r1             22            29
r2             27            25
r3             30            21

Pandas：将列标题旋转到行标签

2 个答案: