Creating a column from multiple dataframes with multiple conditions

Time: 2018-07-25 13:57:05

Tags: python pandas merge

I am working with football data and trying to add an opponent column, but I am struggling with the way the dataframes are organized.

****EDIT****

import pandas as pd

defense = {'week': [1, 1, 1, 1, 2, 2, 2, 2], 'team': ['GB', 'MIA', 'CHI', 'DET', 'GB', 'MIA', 'CHI', 'DET']}
games = {'week': [1, 1, 2, 2], 'winner': ['GB', 'MIA', 'GB', 'DET'], 'loser': ['CHI', 'DET', 'MIA', 'CHI']}
def_df = pd.DataFrame(data=defense)
games_df = pd.DataFrame(data=games)

def_df

  team  week
0   GB     1
1  MIA     1
2  CHI     1
3  DET     1
4   GB     2
5  MIA     2
6  CHI     2
7  DET     2

games_df

  loser  week winner
0   CHI     1     GB
1   DET     1    MIA
2   MIA     2     GB
3   CHI     2    DET

I would like to add an ['Opponent'] column to the defense dataframe (def_df), giving each team's opponent for that week:

  team  week Opponent
0   GB     1      CHI
1  MIA     1      DET
2  CHI     1       GB
3  DET     1      MIA
4   GB     2      MIA
5  MIA     2       GB
6  CHI     2      DET
7  DET     2      CHI


Thanks!

2 answers:

Answer 0 (score: 2)

Here is one way, using a nested dictionary that maps week -> team -> opponent:

from collections import defaultdict

d = defaultdict(dict)
for row in games_df.itertuples(index=False):
    d[row.week].update({row.winner: row.loser, row.loser: row.winner})

def_df['opponent'] = def_df.apply(lambda x: d[x['week']][x['team']], axis=1)

print(def_df)

  team  week opponent
0   GB     1      CHI
1  MIA     1      DET
2  CHI     1       GB
3  DET     1      MIA
4   GB     2      MIA
5  MIA     2       GB
6  CHI     2      DET
7  DET     2      CHI
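To make the lookup structure concrete, here is a minimal sketch that rebuilds the nested dictionary from the question's sample data and inspects it. The variable name `mapping` is only an illustrative choice and is not part of the answer above.

```python
from collections import defaultdict

import pandas as pd

# Sample schedule from the question
games_df = pd.DataFrame({
    'week': [1, 1, 2, 2],
    'winner': ['GB', 'MIA', 'GB', 'DET'],
    'loser': ['CHI', 'DET', 'MIA', 'CHI'],
})

d = defaultdict(dict)
for row in games_df.itertuples(index=False):
    # Register the game from both sides so either team can be looked up
    d[row.week].update({row.winner: row.loser, row.loser: row.winner})

# Convert to plain dicts for inspection (hypothetical helper variable)
mapping = {week: dict(teams) for week, teams in d.items()}
# mapping[1] == {'GB': 'CHI', 'CHI': 'GB', 'MIA': 'DET', 'DET': 'MIA'}
# mapping[2] == {'GB': 'MIA', 'MIA': 'GB', 'DET': 'CHI', 'CHI': 'DET'}

print(mapping[2]['DET'])  # opponent of DET in week 2
```

Because every game is stored twice, the lookup d[week][team] works whether the team won or lost.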

An equally valid alternative uses (week, team) tuples as keys and avoids the collections import:

d = {}
for row in games_df.itertuples(index=False):
    d[(row.week, row.winner)] = row.loser
    d[(row.week, row.loser)] = row.winner

def_df['opponent'] = def_df.set_index(['week', 'team']).index.map(d.get)
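Since the question is tagged merge, the same (week, team) lookup can also be written as a merge. This is a sketch of a different technique from the dictionary mapping above, not the answerer's method. The names `as_winner`, `as_loser`, `lookup` and `result` are illustrative assumptions.

```python
import pandas as pd

# Sample data from the question
def_df = pd.DataFrame({
    'week': [1, 1, 1, 1, 2, 2, 2, 2],
    'team': ['GB', 'MIA', 'CHI', 'DET', 'GB', 'MIA', 'CHI', 'DET'],
})
games_df = pd.DataFrame({
    'week': [1, 1, 2, 2],
    'winner': ['GB', 'MIA', 'GB', 'DET'],
    'loser': ['CHI', 'DET', 'MIA', 'CHI'],
})

# View each game once from the winner's side and once from the loser's side
as_winner = games_df.rename(columns={'winner': 'team', 'loser': 'opponent'})
as_loser = games_df.rename(columns={'loser': 'team', 'winner': 'opponent'})

# Long-format lookup table with one row per (week, team)
lookup = pd.concat([as_winner, as_loser], ignore_index=True)

# A left merge keeps every row of def_df, in its original order
result = def_df.merge(lookup, on=['week', 'team'], how='left')
print(result)
```

With how='left', a team that has no game in a given week would still appear in the result, with a missing value in the opponent column.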

Answer 1 (score: 1)

Updated

Build the opponent column by looping over both dataframes:

opponent_list = []
for team, week in zip(def_df['team'], def_df['week']):
    for gameweek, winner, loser in zip(games_df['week'], games_df['winner'], games_df['loser']):
        if gameweek == week and (winner == team or loser == team):
            if winner == team:
                opponent_list.append(loser)
            else:
                opponent_list.append(winner)
def_df['opponent'] = opponent_list
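One caveat with the loop above: if a team has no game in some week, nothing is appended for that row, so opponent_list ends up shorter than def_df and the final assignment raises an error. A minimal sketch of a variant that always produces one entry per row follows. The helper `find_opponent` is a hypothetical name introduced here, and returning None for a missing game is an assumption about the desired behaviour.

```python
import pandas as pd

# Sample data from the question
def_df = pd.DataFrame({
    'week': [1, 1, 1, 1, 2, 2, 2, 2],
    'team': ['GB', 'MIA', 'CHI', 'DET', 'GB', 'MIA', 'CHI', 'DET'],
})
games_df = pd.DataFrame({
    'week': [1, 1, 2, 2],
    'winner': ['GB', 'MIA', 'GB', 'DET'],
    'loser': ['CHI', 'DET', 'MIA', 'CHI'],
})


def find_opponent(team, week):
    """Return the team's opponent in the given week, or None if no game is found."""
    for gameweek, winner, loser in zip(games_df['week'], games_df['winner'], games_df['loser']):
        if gameweek == week:
            if winner == team:
                return loser
            if loser == team:
                return winner
    return None  # no game for this team in this week


# Exactly one value per row of def_df, so the lengths always match
def_df['opponent'] = [
    find_opponent(team, week) for team, week in zip(def_df['team'], def_df['week'])
]
print(def_df)
```

This still scans games_df once per row, so for large schedules the dictionary approach in the other answer is likely to be faster.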