Hi, I have a data frame as follows:
import numpy as np
import pandas as pd

df = pd.DataFrame()
df['Team1'] = ['A','B','C','D','E','F','A','B','C','D','E','F']
df['Score1'] = [1,2,3,1,2,4,1,2,3,1,2,4]
df['Team2'] = ['U','V','W','X','Y','Z','U','V','W','X','Y','Z']
df['Score2'] = [2,1,2,2,3,3,2,1,2,2,3,3]
df['Match'] = df['Team1'] + ' Vs '+ df['Team2']
df['Match_no']= [1,2,3,4,5,6,1,2,3,4,5,6]
df['model'] = ['ELO','ELO','ELO','ELO','ELO','ELO','xG','xG','xG','xG','xG','xG']
# Team1 is the winner when it outscores Team2; otherwise (ties included) Team2 is taken
winner = df['Score1'] > df['Score2']
df['winner'] = np.where(winner, df['Team1'], df['Team2'])
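What I want to do is create another data frame for the next stage of matches. In the next stage there will be 3 matches for each model (ELO and xG). I want to group by model: within each model, the winner of each odd-numbered match plays the winner of the following even-numbered match, i.e. winner of match 1 vs winner of match 2, winner of match 3 vs winner of match 4, and so on (e.g. U vs B, C vs X, Y vs F). Could anyone advise me how to pick those teams?
The new data frame I expect is as follows: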
df1 =pd.DataFrame()
df1['Team1'] = ['U','C','Y','U','C','Y']
df1['Team2'] = ['B','X','F','B','X','F']
df1['Match'] = df1['Team1'] + ' Vs '+ df1['Team2']
df1['Match_no']= [1,2,3,1,2,3]
df1['model'] = ['ELO','ELO','ELO','xG','xG','xG']
How can I set this up? Thanks,
Zep
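Answer 0 (score: 1)
Although I find it hard to understand what "winner of odd match number vs winner of even match number" means, I'll try to give you an answer.
If it means pairing the winners of matches 1 and 2, then 3 and 4, and so on, then you can do something as simple as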
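df1 = pd.DataFrame()
# .values drops the index so the two slices line up by position rather than by label
df1['Team1'] = df.loc[::2, 'winner'].values
df1['Team2'] = df.loc[1::2, 'winner'].values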
assuming your data is sorted in the order shown. If it is not, you can sort each model's rows with
df[df['model'] == 'ELO'].sort_values('Match_no')
and likewise for the other model. If I've understood correctly, pandas groupby doesn't seem to be needed.
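As a rough sketch of the sorting idea above, the following rebuilds the question's frame, shuffles it to simulate unsorted input, sorts by model and Match_no, and then pairs consecutive winners. The names shuffled, ordered and next_round are made up for illustration, and the sketch assumes every model has an even number of matches.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({
    'Team1': list('ABCDEF') * 2,
    'Score1': [1, 2, 3, 1, 2, 4] * 2,
    'Team2': list('UVWXYZ') * 2,
    'Score2': [2, 1, 2, 2, 3, 3] * 2,
    'Match_no': [1, 2, 3, 4, 5, 6] * 2,
    'model': ['ELO'] * 6 + ['xG'] * 6,
})
df['winner'] = np.where(df['Score1'] > df['Score2'], df['Team1'], df['Team2'])

# Shuffle the rows to simulate input that is not in match order.
shuffled = df.sample(frac=1, random_state=0)

# Sort so that each model's matches are contiguous and in Match_no order.
ordered = shuffled.sort_values(['model', 'Match_no']).reset_index(drop=True)

# Even positions hold odd-numbered matches, odd positions hold even-numbered ones.
# Assumption: each model has an even number of matches, so pairs never straddle models.
next_round = pd.DataFrame({
    'Team1': ordered['winner'].iloc[::2].to_numpy(),
    'Team2': ordered['winner'].iloc[1::2].to_numpy(),
    'model': ordered['model'].iloc[::2].to_numpy(),
})
print(next_round)
```

Sorting on both columns at once avoids filtering each model separately, at the cost of relying on the even-count assumption stated above.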
Answer 1 (score: 1)
You can use GroupBy.cumcount to number the rows within each group:
df1 = pd.DataFrame()
df1['Team1'] = df.loc[::2, 'winner'].values
df1['Team2'] = df.loc[1::2, 'winner'].values
df1['Match'] = df1['Team1'] + ' Vs '+ df1['Team2']
model = df.loc[::2, 'model'].values
df1['Match_no'] = df1.groupby(model).cumcount() + 1
df1['model'] = model
print(df1)
  Team1 Team2   Match  Match_no model
0     U     B  U Vs B         1   ELO
1     C     X  C Vs X         2   ELO
2     Y     F  Y Vs F         3   ELO
3     U     B  U Vs B         1    xG
4     C     X  C Vs X         2    xG
5     Y     F  Y Vs F         3    xG
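If the rows cannot be trusted to arrive in match order, one possible alternative is to derive the pairing from Match_no itself instead of from row position. The sketch below is not part of the answer above: the helper columns pair and slot are hypothetical names introduced for illustration, and it assumes each (model, Match_no) combination occurs exactly once.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({
    'Team1': list('ABCDEF') * 2,
    'Score1': [1, 2, 3, 1, 2, 4] * 2,
    'Team2': list('UVWXYZ') * 2,
    'Score2': [2, 1, 2, 2, 3, 3] * 2,
    'Match_no': [1, 2, 3, 4, 5, 6] * 2,
    'model': ['ELO'] * 6 + ['xG'] * 6,
})
df['winner'] = np.where(df['Score1'] > df['Score2'], df['Team1'], df['Team2'])

# Hypothetical helper columns:
# 'pair' is the next-round match number (matches 1,2 -> 1; 3,4 -> 2; 5,6 -> 3),
# 'slot' says whether the winner fills Team1 (odd match) or Team2 (even match).
df['pair'] = (df['Match_no'] + 1) // 2
df['slot'] = np.where(df['Match_no'] % 2 == 1, 'Team1', 'Team2')

# Move 'slot' into the columns so each (model, pair) becomes one row.
df1 = (
    df.set_index(['model', 'pair', 'slot'])['winner']
      .unstack('slot')
      .reset_index()
      .rename(columns={'pair': 'Match_no'})
)
df1.columns.name = None
df1['Match'] = df1['Team1'] + ' Vs ' + df1['Team2']
df1 = df1[['Team1', 'Team2', 'Match', 'Match_no', 'model']]
print(df1)
```

Because the pairing is computed from Match_no, the result does not depend on the order of the input rows; unstack raises an error if a (model, Match_no) combination is duplicated, which surfaces bad input rather than silently mispairing teams.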