ValueError when assigning data to a DataFrame

Time: 2018-06-14 06:24:45

Tags: python-3.x pandas

I am assigning several pieces of data to a DataFrame, and I get the following error:

  

ValueError: If using all scalar values, you must pass an index

I followed the advice in another question posted here, but it did not work.

Below is my code. All you have to do is copy and paste it into your IDE.

import pandas as pd
import numpy as np

#Loading Team performance Data (ExpG (Home away)) For and against
epl_1718 = pd.read_csv("http://www.football-data.co.uk/mmz4281/1718/E0.csv")

epl_1718 = epl_1718[['HomeTeam','AwayTeam','FTHG','FTAG']]

epl_1718 = epl_1718.rename(columns={'FTHG': 'HomeGoals', 'FTAG': 'AwayGoals'})
Home_goal_avg = epl_1718['HomeGoals'].mean()
Away_goal_avg = epl_1718['AwayGoals'].mean()


Home_team_goals        = epl_1718.groupby(['HomeTeam'])['HomeGoals'].sum()
Home_count             = epl_1718.groupby(['HomeTeam'])['HomeTeam'].count()
Home_team_avg_goal     = Home_team_goals/Home_count
Home_team_concede      = epl_1718.groupby(['HomeTeam'])['AwayGoals'].sum()
EPL_Home_average_score = epl_1718['HomeGoals'].mean()
EPL_Home_average_conc  = epl_1718['HomeGoals'].mean()
Home_team_avg_conc     = Home_team_concede/Home_count

Away_team_goals        = epl_1718.groupby(['AwayTeam'])['AwayGoals'].sum()
Away_count             = epl_1718.groupby(['AwayTeam'])['AwayTeam'].count()
Away_team_avg_goal     = Away_team_goals/Away_count
Away_team_concede      = epl_1718.groupby(['AwayTeam'])['HomeGoals'].sum()
EPL_Away_average_score = epl_1718['AwayGoals'].mean()
EPL_Away_average_conc  = epl_1718['HomeGoals'].mean()
Away_team_avg_conc     = Away_team_concede/Away_count



Home_attk_sth = Home_team_avg_goal/EPL_Home_average_score
Home_attk_sth = Home_attk_sth.sort_index().reset_index()

Home_def_sth  = Home_team_avg_conc/EPL_Home_average_conc
Home_def_sth  = Home_def_sth .sort_index().reset_index()

Away_attk_sth = Away_team_avg_goal/EPL_Away_average_score
Away_attk_sth = Away_attk_sth .sort_index().reset_index()


Away_def_sth  = Away_team_avg_conc/EPL_Away_average_conc
Away_def_sth = Away_def_sth.sort_index().reset_index()

Home_def_sth
HomeTeam = epl_1718['HomeTeam'].drop_duplicates().sort_index().reset_index().set_index('HomeTeam')
AwayTeam = epl_1718['AwayTeam'].drop_duplicates().sort_index().reset_index().sort_values(['AwayTeam']).set_index(['AwayTeam'])
#HomeTeam = HomeTeam.sort_index().reset_index()

Team = HomeTeam.append(AwayTeam).drop_duplicates()





Data = pd.DataFrame({"Team":Team,
                     "Home_attkacking":Home_attk_sth,
                     "Home_def": Home_def_sth,
                     "Away_Attacking":Away_attk_sth,
                     "Away_def":Away_def_sth,
                     "EPL_Home_avg_score":EPL_Home_average_score,
                     "EPL_Home_average_conc":EPL_Home_average_conc,
                     "EPL_Away_average_score":EPL_Away_average_score,
                     "EPL_Away_average_conc":EPL_Away_average_conc},
                    columns =['Team','Home_attacking','Home_def','Away_attacking','Away_def',
                             'EPL_Home_avg_score','EPL_Home_avg_conc','EPL_Away_avg_score','EPL_Away_average_conc'])

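In this code, I am trying to get the average number of goals each team scores per match and the average number of goals each team concedes per match. I then calculate other performance factors such as attacking strength and defensive strength.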

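I had to paste the full code so that the example shows how the DataFrame is created. Thank you for your understanding, and thanks in advance for any suggestions.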

The final DataFrame should have the following format (columns):

Team, Home Attacking, Home Defensive, Away Attacking, Away Defensive

and so on, as listed in the DataFrame constructor.

This means the Team column contains only the 20 teams, and the DataFrame has shape (20, 9).

Regards

1 answer:

Answer 0 (score: 1)

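The main idea here is to remove reset_index so that each Series keeps the teams as its index. The variable Team is then unnecessary, and reset_index becomes the last step. Also be careful with the column names in the DataFrame constructor: if the dictionary key is EPL_Home_average_conc but the columns list says EPL_Home_avg_conc, you get a column of NaNs: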

Home_team_goals        = epl_1718.groupby(['HomeTeam'])['HomeGoals'].sum()
Home_count             = epl_1718.groupby(['HomeTeam'])['HomeTeam'].count()
Home_team_avg_goal     = Home_team_goals/Home_count
Home_team_concede      = epl_1718.groupby(['HomeTeam'])['AwayGoals'].sum()
EPL_Home_average_score = epl_1718['HomeGoals'].mean()
EPL_Home_average_conc  = epl_1718['HomeGoals'].mean()
Home_team_avg_conc     = Home_team_concede/Home_count

Away_team_goals        = epl_1718.groupby(['AwayTeam'])['AwayGoals'].sum()
Away_count             = epl_1718.groupby(['AwayTeam'])['AwayTeam'].count()
Away_team_avg_goal     = Away_team_goals/Away_count
Away_team_concede      = epl_1718.groupby(['AwayTeam'])['HomeGoals'].sum()
EPL_Away_average_score = epl_1718['AwayGoals'].mean()
EPL_Away_average_conc  = epl_1718['HomeGoals'].mean()
Away_team_avg_conc     = Away_team_concede/Away_count


#removed reset_index
Home_attk_sth = Home_team_avg_goal/EPL_Home_average_score
Home_attk_sth = Home_attk_sth.sort_index()

Home_def_sth  = Home_team_avg_conc/EPL_Home_average_conc
Home_def_sth  = Home_def_sth .sort_index()

Away_attk_sth = Away_team_avg_goal/EPL_Away_average_score
Away_attk_sth = Away_attk_sth .sort_index()


Away_def_sth  = Away_team_avg_conc/EPL_Away_average_conc
Away_def_sth = Away_def_sth.sort_index()
Data = pd.DataFrame({"Home_attacking":Home_attk_sth,
                     "Home_def": Home_def_sth,
                     "Away_attacking":Away_attk_sth,
                     "Away_def":Away_def_sth,
                     "EPL_Home_average_score":EPL_Home_average_score,
                     "EPL_Home_average_conc":EPL_Home_average_conc,
                     "EPL_Away_average_score":EPL_Away_average_score,
                     "EPL_Away_average_conc":EPL_Away_average_conc},
                    columns =['Home_attacking','Home_def','Away_attacking','Away_def',
                              'EPL_Home_average_score','EPL_Home_average_conc',
                              'EPL_Away_average_score','EPL_Away_average_conc'])

#column from index
Data = Data.rename_axis('Team').reset_index()
print (Data)
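
The same behaviour can be reproduced on a small scale without downloading the CSV. The sketch below uses made-up team names and numbers (they are assumptions for illustration, not values from the EPL data) to show three things: a dict of only scalars raises the ValueError from the question, a Series indexed by team supplies the index so scalars are broadcast to every row, and a name in columns that does not match a dict key produces a NaN column.

```python
import pandas as pd

# Hypothetical per-team Series, standing in for Home_attk_sth / Away_attk_sth.
home_attack = pd.Series({"Arsenal": 1.2, "Chelsea": 0.9, "Everton": 0.8})
away_attack = pd.Series({"Chelsea": 1.1, "Everton": 0.7, "Arsenal": 1.0})
league_home_avg = 1.5  # hypothetical scalar, like EPL_Home_average_score

# 1) Only scalars and no index: pandas cannot infer the number of rows.
scalar_error = None
try:
    pd.DataFrame({"a": league_home_avg, "b": league_home_avg})
except ValueError as err:
    scalar_error = str(err)

# 2) Series indexed by team: rows are aligned on the team labels and the
#    scalar is repeated for every team.
data = pd.DataFrame(
    {
        "Home_attacking": home_attack,
        "Away_attacking": away_attack,
        "EPL_Home_average_score": league_home_avg,
    },
    columns=["Home_attacking", "Away_attacking", "EPL_Home_average_score"],
)
# Turn the team index into a regular column as the last step.
data = data.rename_axis("Team").reset_index()

# 3) A name in `columns` that is not a dict key gives an all-NaN column.
mismatched = pd.DataFrame(
    {"Home_attacking": home_attack, "EPL_Home_average_score": league_home_avg},
    columns=["Home_attacking", "EPL_Home_avg_score"],  # key spelled differently
)

print(scalar_error)
print(data)
print(mismatched)
```

Because the values are aligned on the index labels, the differing order of the two Series does not matter: each team ends up with its own home and away figures on one row.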