Pandas - converting dtypes

Time: 2020-05-10 17:16:40

Tags: python pandas

I need to merge these two dataframes:

df_melt

     MatchID  GameWeek        Date                      Team  Home               AgainstTeam
0      46605         1  2019-08-09                 Liverpool  Home              Norwich City
1      46605         1  2019-08-09              Norwich City  Away                 Liverpool
2      46606         1  2019-08-10           AFC Bournemouth  Home          Sheffield United
3      46606         1  2019-08-10          Sheffield United  Away           AFC Bournemouth
4      46607         1  2019-08-10                   Burnley  Home               Southampton
..       ...       ...         ...                       ...   ...                       ...
533    46871        27  2020-02-23                   Watford  Away         Manchester United
534    46872        27  2020-02-22          Sheffield United  Home  Brighton and Hove Albion
535    46872        27  2020-02-22  Brighton and Hove Albion  Away          Sheffield United
536    46873        27  2020-02-22               Southampton  Home               Aston Villa
537    46873        27  2020-02-22               Aston Villa  Away               Southampton

df_pm

                                       Player  GameWeek  Minutes  ... CloseShotCreated TotalShotCreated  HeadersCreated
PlayerMatchesDetailID                                             ...                                                  
1                                     Alisson         1       90  ...                0                0               0
2                             Virgil van Dijk         1       90  ...                0                0               0
3                                Joseph Gomez         1       90  ...                0                1               0
4                            Andrew Robertson         1       90  ...                0                1               0
5                      Trent Alexander-Arnold         1       90  ...                3                3               1
...                                       ...       ...      ...  ...              ...              ...             ...
15053                             Matty James        22        0  ...                0                0               0
15054                             Matty James        23        0  ...                0                0               0
15055                             Matty James        24        0  ...                0                0               0
15056                             Matty James        25        0  ...                0                0               0
15057                             Matty James        26        0  ...                0                0               0

This is how I try to perform the merge:

# Instantiate empty lists
match_ids = []
home_away = []
dates = []

# For each row in the player matches dataframe...
for row in df_pm.itertuples():
    # Look up the match id from the team matches dataframe
    team = row.ForTeam
    againstteam = row.AgainstTeam
    gameweek = row.GameWeek

    match_id = df_melt.loc[(df_melt['GameWeek']==gameweek)
                          &(df_melt['Team']==team)
                          &(df_melt['AgainstTeam']==againstteam),
                          'MatchID'].item()
    print ('MATCH',match_id)

    date = df_melt.loc[(df_melt['GameWeek']==gameweek)
                          &(df_melt['Team']==team)
                          &(df_melt['AgainstTeam']==againstteam),
                          'Date'].item()

    home = df_melt.loc[(df_melt['GameWeek']==gameweek)
                          &(df_melt['Team']==team)
                          &(df_melt['AgainstTeam']==againstteam),
                          'Home'].item()

    # Add the values to the lists
    match_ids.append(match_id)
    home_away.append(home)
    dates.append(date)

But I get:

Traceback (most recent call last):
  File "tableau_data_generation.py", line 161, in <module>
    'MatchID'].item()
  File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/base.py", line 652, in item
    return self.values.item()
ValueError: can only convert an array of size 1 to a Python scalar

This suggests that some rows might not exist. But after printing the whole dataframe, I found no missing data.

But when I check the types, I see:

df_melt

MatchID        object
GameWeek       object
Date           object
Team           object
Home           object
AgainstTeam    object
dtype: object

df_pm

Player                 object
GameWeek                int64
Minutes                 int64
ForTeam                object
AgainstTeam            object
Goals                   int64
ShotsOnTarget           int64
ShotsInBox              int64
CloseShots              int64
TotalShots              int64
Headers                 int64
GoalAssists             int64
ShotOnTargetCreated     int64
ShotInBoxCreated        int64
CloseShotCreated        int64
TotalShotCreated        int64
HeadersCreated          int64
dtype: object

I guess this must be the culprit...

What is the best way to solve this and convert the mismatched types?

Note: the solution provided:

Later, I need to perform the following task:

1 answer:

Answer 0 (score: 0)

To convert a dataframe column, do the following:

df[column_name] = df[column_name].astype(datatype)

where datatype is, for example, int, str, or float.
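
As a rough illustration of the conversion applied to the question's data, here is a minimal sketch. It assumes the object columns in df_melt hold numeric strings such as "1" (the question only shows that their dtype is object, so this is a guess), and it uses tiny made-up stand-ins for both dataframes. Under that assumption, comparing GameWeek with an integer matches no rows, the filtered selection is empty, and .item() raises the ValueError shown above. The sketch also swaps the row-by-row lookup for a single DataFrame.merge, which is one possible alternative and not something the question or the answer prescribes.

```python
import pandas as pd

# Toy stand-in for df_melt. ASSUMPTION: the object columns hold strings.
df_melt = pd.DataFrame({
    "MatchID": ["46605", "46605"],
    "GameWeek": ["1", "1"],
    "Date": ["2019-08-09", "2019-08-09"],
    "Team": ["Liverpool", "Norwich City"],
    "Home": ["Home", "Away"],
    "AgainstTeam": ["Norwich City", "Liverpool"],
})

# Toy stand-in for df_pm, where GameWeek is already an integer column.
df_pm = pd.DataFrame({
    "Player": ["Alisson", "Virgil van Dijk"],
    "GameWeek": [1, 1],
    "ForTeam": ["Liverpool", "Liverpool"],
    "AgainstTeam": ["Norwich City", "Norwich City"],
})

# Before converting: the string "1" never equals the integer 1,
# so the boolean mask selects no rows at all.
matches_before = int((df_melt["GameWeek"] == 1).sum())

# Convert the mismatched columns, as the answer suggests.
df_melt["GameWeek"] = df_melt["GameWeek"].astype(int)
df_melt["MatchID"] = df_melt["MatchID"].astype(int)
df_melt["Date"] = pd.to_datetime(df_melt["Date"])

# After converting: the same comparison now finds the rows.
matches_after = int((df_melt["GameWeek"] == 1).sum())

# Possible alternative to the loop: one left merge on the three lookup keys.
# "Team" is renamed so that the key names line up with df_pm.
merged = df_pm.merge(
    df_melt.rename(columns={"Team": "ForTeam"}),
    on=["GameWeek", "ForTeam", "AgainstTeam"],
    how="left",
)
print(merged[["Player", "MatchID", "Date", "Home"]])
```

With how="left", every row of df_pm is kept, so a player row without a matching fixture shows up with missing values in MatchID, Date and Home instead of raising an error.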