I have a dataset that contains duplicates:
# Count the number of injuries per severity level for each player
levelcount = df.groupby(['Relinquished', 'Severity']).size().reset_index(name='Count')
levelcount['DTD'] = ''
levelcount['DNP'] = ''
levelcount['out indefinitely'] = ''
levelcount['out for season'] = ''
levelcount.head(4)
  Relinquished  Severity  Count DTD DNP outindefinitely outforseason
0      player1         1      1
1      player1         3      1
2      player2         3      1
3      player3         1      3
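I want to add the count values to the other dataframe in the appropriate order:
1: DTD,
2: DNP,
3: out indefinitely,
4: out for season
I tried using if statements, but I can't seem to make a breakthrough. Thanks in advance!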
if levelcount['Severity'] == 1:
    df_extension['DTD'] = levelcount['']
if levelcount['Severity'] == 2:
    df_extension['DNP'] = levelcount['']
if levelcount['Severity'] == 3:
    df_extension['out indefinitely'] = levelcount['']
if levelcount['Severity'] == 4:
    df_extension['out for season'] = levelcount['']
Answer 0 (score: 1)
Use Series.map with a dictionary to build the new column labels, append them to the index with DataFrame.set_index, and reshape with Series.unstack:
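levelcount = df.groupby(['Relinquished', 'Severity']).size().reset_index(name='Count')
# Map each severity level to the name of its target column
d = {1: 'DTD', 2: 'DNP', 3: 'outindefinitely', 4: 'outforseason'}
# Append the mapped labels to the index, then pivot them into columns
new = levelcount.set_index(levelcount['Severity'].map(d), append=True)['Count'].unstack()
# reindex adds the columns for levels that never occur (all NaN)
levelcount = levelcount.join(new.reindex(list(d.values()), axis=1))
print(levelcount)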
  Relinquished  Severity  Count  DTD  DNP  outindefinitely  outforseason
0      player1         1      1  1.0  NaN              NaN           NaN
1      player1         3      1  NaN  NaN              1.0           NaN
2      player2         3      1  NaN  NaN              1.0           NaN
3      player3         1      3  3.0  NaN              NaN           NaN
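To try these steps end to end, here is a minimal self-contained sketch. The rows in `df` are made up for illustration (they are not the asker's data) and are chosen so that the grouped counts match the table in the question. The intermediate names `labels` and `result` are also just for this example.

```python
import pandas as pd

# Hypothetical sample data: one row per injury record (assumed, not from the question)
df = pd.DataFrame({
    'Relinquished': ['player1', 'player1', 'player2', 'player3', 'player3', 'player3'],
    'Severity':     [1, 3, 3, 1, 1, 1],
})

# Count injuries per player and severity level
levelcount = df.groupby(['Relinquished', 'Severity']).size().reset_index(name='Count')

# Severity level -> target column name
d = {1: 'DTD', 2: 'DNP', 3: 'outindefinitely', 4: 'outforseason'}

# Step 1: translate each row's severity into its column label
labels = levelcount['Severity'].map(d)

# Step 2: append the labels as a second index level, keep only the counts,
# and pivot that level into columns
new = levelcount.set_index(labels, append=True)['Count'].unstack()

# Step 3: make sure all four columns exist, in dictionary order
new = new.reindex(list(d.values()), axis=1)

# Step 4: attach the new columns to the original rows (aligned on the row index)
result = levelcount.join(new)
print(result)
```

Because `unstack` only produces columns for labels that actually occur, the `reindex` step is what guarantees that `DNP` and `outforseason` are present even though no row has severity 2 or 4.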
Your solution can be changed to loop over the dictionary and set the new columns:
levelcount = df.groupby(['Relinquished', 'Severity']).size().reset_index(name='Count')
d = {1: 'DTD', 2: 'DNP', 3: 'outindefinitely', 4: 'outforseason'}
for k, v in d.items():
    # Fill column v only in the rows whose severity equals k
    levelcount.loc[levelcount['Severity'] == k, v] = levelcount['Count']
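As for why the plain `if` in the question does not work: `levelcount['Severity'] == 1` is a boolean Series with one value per row, and pandas refuses to collapse it into a single True/False. A row-wise condition has to be applied as a mask instead. The sketch below shows both points on a hand-built `levelcount` that mirrors the question's table (the values are assumed). Note that it uses `Series.where` rather than the `.loc` assignment, which is a variation of mine, so that every column is created explicitly even when no row matches.

```python
import pandas as pd

# Hand-built stand-in for the grouped counts (assumed values)
levelcount = pd.DataFrame({
    'Relinquished': ['player1', 'player1', 'player2', 'player3'],
    'Severity':     [1, 3, 3, 1],
    'Count':        [1, 1, 1, 3],
})

# The comparison yields one boolean per row, not a single boolean
mask = levelcount['Severity'] == 1

try:
    if mask:  # ambiguous: which row's value should decide?
        pass
except ValueError as err:
    print('if on a Series raises', type(err).__name__)

# Severity level -> target column name
d = {1: 'DTD', 2: 'DNP', 3: 'outindefinitely', 4: 'outforseason'}

for k, v in d.items():
    # Keep Count where the severity matches k, NaN everywhere else
    levelcount[v] = levelcount['Count'].where(levelcount['Severity'] == k)

print(levelcount)
```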