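Unmelt a pandas DataFrame and create new columns for the duplicated columns

Date: 2018-01-05 09:38:01

Tags: python pandas dataframe

I have a dataset of sports team performance across seasons, from which I created a melted dataset that separates the home and away teams.

Original dataset: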

  Div      Date  HomeTeam   AwayTeam  FTHG  FTAG FTR  HTHG  HTAG HTR   Referee
0  E0  11/08/17   Arsenal  Leicester     4     3   H     2     2   D    M Dean
1  E0  12/08/17  Brighton   Man City     0     2   A     0     0   D  M Oliver
2  E0  12/08/17   Chelsea    Burnley     2     3   A     0     3   A  C Pawson

Melted dataset:

  Div        Date        HomeTeam      AwayTeam  FTHG  FTAG FTR  HTHG  HTAG  \
0  E0  2017-08-11         Arsenal     Leicester     4     3   H     2     2   
1  E0  2017-08-11         Arsenal     Leicester     4     3   H     2     2   
2  E0  2017-08-12         Watford     Liverpool     3     3   D     2     1   
3  E0  2017-08-12       West Brom   Bournemouth     1     0   H     1     0   
4  E0  2017-08-12  Crystal Palace  Huddersfield     0     3   A     0     2   

  HTR   Referee  Home/Away          Team        Opponent  
0   D    M Dean          1       Arsenal       Leicester  
1   D    M Dean          0     Leicester         Arsenal  
2   H  A Taylor          0     Liverpool         Watford  
3   H  R Madley          0   Bournemouth       West Brom  
4   A    J Moss          0  Huddersfield  Crystal Palace  
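
For reference, a melted layout like the one above could be built roughly as follows. This is only a sketch under assumptions: the question does not show how the melt was done, the frame name `matches` is hypothetical, and only a subset of the original columns is reproduced.

```python
import pandas as pd

# Three rows copied from the original dataset above (subset of columns).
# The name `matches` is hypothetical; it does not appear in the question.
matches = pd.DataFrame({
    "Date": ["11/08/17", "12/08/17", "12/08/17"],
    "HomeTeam": ["Arsenal", "Brighton", "Chelsea"],
    "AwayTeam": ["Leicester", "Man City", "Burnley"],
    "FTHG": [4, 0, 2],
    "FTAG": [3, 2, 3],
})
matches["Date"] = pd.to_datetime(matches["Date"], format="%d/%m/%y")

# Home perspective: the home side is "Team".
home = matches.copy()
home["Home/Away"] = 1
home["Team"] = home["HomeTeam"]
home["Opponent"] = home["AwayTeam"]

# Away perspective: the away side is "Team".
away = matches.copy()
away["Home/Away"] = 0
away["Team"] = away["AwayTeam"]
away["Opponent"] = away["HomeTeam"]

# Stack both views so every match appears twice, home row first.
melted = (
    pd.concat([home, away])
    .sort_values(["Date", "HomeTeam", "Home/Away"], ascending=[True, True, False])
    .reset_index(drop=True)
)
print(melted[["Date", "Home/Away", "Team", "Opponent"]])
```

Each match then contributes one row per side, which is why the first two rows of the melted dataset share the same `HomeTeam` and `AwayTeam`.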

I also added extra columns to compute cumulative goals and similar statistics. In practice, it looks like this:

         Date  Home/Away          Team        Opponent  Cumg  Cumc  Result  \
0  2017-08-11          1       Arsenal       Leicester   0.0   0.0       1   
1  2017-08-11          0     Leicester         Arsenal   0.0   0.0       1   
2  2017-08-12          0     Liverpool         Watford   0.0   0.0       2   
3  2017-08-12          0   Bournemouth       West Brom   0.0   0.0       1   
4  2017-08-12          0  Huddersfield  Crystal Palace   0.0   0.0       0   

   Cumw  Cuml  Cumd  Cumtr  win_streak  lose_streak  
0   0.0   0.0   0.0    0.0         0.0          0.0  
1   0.0   0.0   0.0    0.0         0.0          0.0  
2   0.0   0.0   0.0    0.0         0.0          0.0  
3   0.0   0.0   0.0    0.0         0.0          0.0  
4   0.0   0.0   0.0    0.0         0.0          0.0  

I want to "unmelt" the dataset back to its original format while keeping the new columns I added, like this:

         Date  Home/Away     Team     Opponent  Cumg_team  Cumc_team  Result  \
0  2017-08-11          1  Arsenal    Leicester        0.0        0.0       1   
1  2017-08-19          0  Arsenal        Stoke        3.0        4.0       1   
2  2017-08-27          0  Arsenal    Liverpool        3.0        5.0       1   
3  2017-09-09          1  Arsenal  Bournemouth        3.0        9.0       1   
4  2017-09-17          0  Arsenal      Chelsea        3.0       12.0       2  

   Cumw_team  Cuml_team  Cumd  Cumtr_team  win_streak_team  lose_streak_team  \
0        0.0        0.0   0.0         0.0              0.0               0.0   
1        0.0        1.0   0.0         0.0              0.0               0.0   
2        0.0        2.0   0.0         0.0              1.0               0.0   
3        0.0        3.0   0.0         0.0              0.0               0.0   
4        0.0        4.0   0.0         0.0              0.0               0.0  

   Cumw_opponent  Cuml_opponent  Cumg_opponent  Cumc_opponent  Cumtr_opponent  \
0            0.0            0.0            0.0            0.0             0.0   
1            0.0            1.0            0.0            1.0             0.0   
2            0.0            1.0            3.0            4.0             1.0   
3            2.0            1.0            4.0            2.0             6.0   
4            3.0            1.0            7.0            6.0             9.0  

   win_streak_opponent  lose_streak_opponent  
0                  0.0                   0.0  
1                  0.0                   0.0  
2                  0.0                   0.0  
3                  0.0                   0.0  
4                  0.0                   0.0

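I can do this with:

df1 = df[df['Team'] == 'Arsenal'].set_index('Date')
df2 = df[df['Opponent'] == 'Arsenal'].set_index('Date')

# Suffixes are needed because both frames share the same column names
df3 = df1.join(df2, lsuffix='_team', rsuffix='_opponent').reset_index()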

But that is not very efficient, and I was wondering whether there is a way to do this with pure SQL-like manipulation of the pandas df?

1 answer:

Answer 0 (score: 1)

It seems you need:

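
One SQL-like way to approach this is a self-merge of the melted frame, sketched below. It is not a verified reconstruction of any particular solution: the sample values are made up, and it assumes a team plays at most one match per date, so that (`Date`, `Team`, `Opponent`) identifies a row uniquely.

```python
import pandas as pd

# Small melted frame in the shape shown in the question (values are made up).
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-08-11", "2017-08-11", "2017-08-12", "2017-08-12"]),
    "Home/Away": [1, 0, 1, 0],
    "Team": ["Arsenal", "Leicester", "Watford", "Liverpool"],
    "Opponent": ["Leicester", "Arsenal", "Liverpool", "Watford"],
    "Cumg": [0.0, 1.0, 2.0, 3.0],
    "Cumc": [0.0, 4.0, 5.0, 6.0],
})

keys = ["Date", "Team", "Opponent"]
# Every column that should be duplicated per side.
stat_cols = [c for c in df.columns if c not in keys + ["Home/Away"]]

# Opponent view: swap Team and Opponent so each row lines up
# with the row describing the other side of the same match.
opp = df[keys + stat_cols].rename(columns={"Team": "Opponent", "Opponent": "Team"})

# Self-merge, comparable to a SQL self-join on (Date, Team, Opponent).
wide = df.merge(opp, on=keys, suffixes=("_team", "_opponent"))

# A single team's history is then a plain filter.
print(wide[wide["Team"] == "Arsenal"])
```

This handles all teams in one pass instead of filtering and joining per team. Columns that should not be duplicated, such as `Result` in the desired output, could be left out of `stat_cols`.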