I have a dataset that looks like this:
batsman batting_team 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
0 A Ashish Reddy Deccan Chargers 0 0 0 0 35 0 0 0 0 0 0
1 A Ashish Reddy Sunrisers Hyderabad 0 0 0 0 0 125 0 73 47 0 0
2 A Chandila Rajasthan Royals 0 0 0 0 0 4 0 0 0 0 0
3 A Chopra Kolkata Knight Riders 42 11 0 0 0 0 0 0 0 0 0
4 A Choudhary Royal Challengers Bangalore 0 0 0 0 0 0 0 0 0 25 0
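I am trying to add up the per-year score columns for rows that share the same batsman name. For example, "Reddy" appears twice, so I want to merge those two observations into one row, as follows:
Name - Reddy
Team - the team name from the second observation
2008, 2009, ..., 2018 - the sum of the two rows' values in each year column.
I have tried a few approaches but could not get anywhere.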
Answer 0 (score: 2)
Try:
# Sum the numeric (year) columns for each batsman
df_out = df.groupby('batsman').sum(numeric_only=True)
# drop_duplicates(keep='last') keeps each batsman's last team; set_index turns it into a lookup Series for map
df_out['batting_team'] = df_out.index.map(df.drop_duplicates(['batsman'], keep='last').set_index('batsman')['batting_team'])
# Reset the index and restore the column order of the input dataframe
df_out.reset_index().reindex(df.columns, axis=1)
Output:
batsman batting_team 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
0 A Ashish Reddy Sunrisers Hyderabad 0 0 0 0 35 125 0 73 47 0 0
1 A Chandila Rajasthan Royals 0 0 0 0 0 4 0 0 0 0 0
2 A Chopra Kolkata Knight Riders 42 11 0 0 0 0 0 0 0 0 0
3 A Choudhary Royal Challengers Bangalore 0 0 0 0 0 0 0 0 0 25 0
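The same result can probably also be reached in a single groupby pass by giving agg one rule per column. This is only a sketch under a few assumptions: the sample rows are retyped from the question, the year columns are assumed to be string labels (the question does not say whether they are strings or integers), and the names years, agg_rules and result are introduced here for illustration.

```python
import pandas as pd

# Sample rows retyped from the question; year labels are ASSUMED to be strings.
years = [str(y) for y in range(2008, 2019)]
rows = [
    ["A Ashish Reddy", "Deccan Chargers", 0, 0, 0, 0, 35, 0, 0, 0, 0, 0, 0],
    ["A Ashish Reddy", "Sunrisers Hyderabad", 0, 0, 0, 0, 0, 125, 0, 73, 47, 0, 0],
    ["A Chandila", "Rajasthan Royals", 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0],
    ["A Chopra", "Kolkata Knight Riders", 42, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    ["A Choudhary", "Royal Challengers Bangalore", 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 0],
]
df = pd.DataFrame(rows, columns=["batsman", "batting_team"] + years)

# One aggregation rule per column: keep the last team seen for each batsman,
# and sum every year column.
agg_rules = {"batting_team": "last", **{y: "sum" for y in years}}

# as_index=False keeps 'batsman' as a regular column, so no reset_index is needed.
result = df.groupby("batsman", as_index=False).agg(agg_rules)
print(result)
```

Because the dict lists batting_team first and the years in order, the columns should come out in the same order as the input, so the reindex step is not needed here. Note that "last" depends on row order, so this assumes the rows are already ordered so that the team you want to keep comes last.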