Python list to indexed DataFrame

Time: 2018-08-29 12:00:58

Tags: python pandas list dataframe indexing

I have a list that holds an indexed result; printing it gives the following (sample shown here):

[Home Team                     season   
 1. FC Kaiserslautern          2010/2011     48
                               2011/2012     24
 1. FC Köln                    2008/2009     35
                               2009/2010     33
                               2010/2011     47
                               2011/2012     39
                               2014/2015     34
                               2015/2016     38
 1. FC Nürnberg                2009/2010     32
                               2010/2011     47
                               2011/2012     38
                               2012/2013     39
                               2013/2014     37

I can't convert it to a pandas DataFrame in the same format. Using df = pd.DataFrame(df) creates a single-row DataFrame with everything bunched together.

The code I used to get the list is:

df = []
home_goals = leaguesFinal.groupby(('Home Team', 'season'))['home_team_goal'].sum()
away_goals = leaguesFinal.groupby(('Away Team', 'season'))['away_team_goal'].sum()
df.append((home_goals + away_goals))

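I just want to sum each team's home goals and away goals for each season. If there is a better way to do this, I'm all ears. In the end, I want a DataFrame so that it is easy to manipulate.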

1 answer:

Answer 0: (score: 0)

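I think both Series need the same MultiIndex level names, which you can set with rename_axis, before combining them with add; to get a DataFrame, use reset_index: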

import pandas as pd

leaguesFinal = pd.DataFrame({
    'Home Team': ['b','a','a','c','b','a'],
    'Away Team': ['a','b','c','a','a','b'],
    'season': ['2010/2011'] * 3 + ['2012/2013'] * 3,
    'home_team_goal': [1,2,3,4,3,2],
    'away_team_goal': [4,6,7,8,2,1]
})
print (leaguesFinal)
  Home Team Away Team     season  home_team_goal  away_team_goal
0         b         a  2010/2011               1               4
1         a         b  2010/2011               2               6
2         a         c  2010/2011               3               7
3         c         a  2012/2013               4               8
4         b         a  2012/2013               3               2
5         a         b  2012/2013               2               1

home_goals = leaguesFinal.groupby(['Home Team', 'season'])['home_team_goal'].sum()
away_goals = leaguesFinal.groupby(['Away Team', 'season'])['away_team_goal'].sum()

print (home_goals)
Home Team  season   
a          2010/2011    5
           2012/2013    2
b          2010/2011    1
           2012/2013    3
c          2012/2013    4
Name: home_team_goal, dtype: int64

print (away_goals)
Away Team  season   
a          2010/2011     4
           2012/2013    10
b          2010/2011     6
           2012/2013     1
c          2010/2011     7
Name: away_team_goal, dtype: int64

a = home_goals.rename_axis(['Team','season'])
b = away_goals.rename_axis(['Team','season'])
df = (a.add(b, fill_value=0)).reset_index(name='both')
print (df)
  Team     season  both
0    a  2010/2011   9.0
1    a  2012/2013  12.0
2    b  2010/2011   7.0
3    b  2012/2013   4.0
4    c  2010/2011   7.0
5    c  2012/2013   4.0
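
The reason for fill_value=0 may not be obvious, so here is a minimal sketch that reuses the toy data from this answer. It compares a plain + with add(fill_value=0). The variable names home, away, plain, filled and result are only illustrative, and the final astype(int) is an optional extra that the original answer does not use.

```python
import pandas as pd

# Same toy data as in the answer above.
leaguesFinal = pd.DataFrame({
    'Home Team': ['b', 'a', 'a', 'c', 'b', 'a'],
    'Away Team': ['a', 'b', 'c', 'a', 'a', 'b'],
    'season': ['2010/2011'] * 3 + ['2012/2013'] * 3,
    'home_team_goal': [1, 2, 3, 4, 3, 2],
    'away_team_goal': [4, 6, 7, 8, 2, 1],
})

# Sum goals per (team, season) and give both results the same level names.
home = (leaguesFinal.groupby(['Home Team', 'season'])['home_team_goal']
        .sum().rename_axis(['Team', 'season']))
away = (leaguesFinal.groupby(['Away Team', 'season'])['away_team_goal']
        .sum().rename_axis(['Team', 'season']))

# Plain "+" aligns on the index; a label present on only one side becomes NaN.
plain = home + away

# add(fill_value=0) treats the missing side as 0 instead.
filled = home.add(away, fill_value=0)

# Optional: cast back to int once no NaN remains, then flatten the index.
result = filled.astype(int).reset_index(name='both')
print(result)
```

In the question's code the summed Series was appended to a list first. pd.DataFrame applied to a list containing one Series treats that Series as a single row, which is likely why everything ended up bunched together. Calling reset_index directly on the Series avoids that.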