Add new rows before and after each gsm_id

Time: 2018-05-16 12:33:19

Tags: python pandas

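My DataFrame has 13 columns and 900 rows. I set one of the columns as the index; its value repeats across multiple events. What I want to do is add two new rows above the existing first row, carrying that index and all 13 columns, with the values copied from the current rows.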

How can I add them?

I also want to add two new rows after the last row of each gsm_id.


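I want to add a new row before the first row and after the last row of each group. gsm_id is set as the index, and the new rows should be added before and after each gsm_id.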

In my expected output the added rows are highlighted in red. Thanks.

1 Answer:

Answer 0: (score: 1)

Use:

import numpy as np
import pandas as pd

# helper column for the final sort: original rows get 2, 3, ... within each gsm_id
df['sort'] = df.groupby('gsm_id').cumcount() + 2

# copy the first 2 rows of each group
df1 = df.groupby('gsm_id').head(2).copy()

# overwrite the values that should differ in the new leading rows
df1[['PreviousEventTime','Goal_Flag','Union_level']] = np.nan
df1[['Run_score','Run_sum']] = 0
df1['Match_sta'] = 'Started'
# shift the sort keys to 0, 1 so these rows land before the group
df1['sort'] -= 2
#print (df1)

# copy the last 2 rows of each group
df2 = df.groupby('gsm_id').tail(2).copy()
# set the event time to 90 minutes after matchdatetime
df2['eventdatetime'] = df2['matchdatetime'] + pd.Timedelta(90, unit='m')
# shift the sort keys past the group's original rows
df2['sort'] += 2
#print (df2)

# concatenate everything and sort to restore the intended order
df = (pd.concat([df1, df2, df])
        .sort_values(['gsm_id','sort'])
        .reset_index(drop=True)
        .drop('sort', axis=1))
print (df)

     gsm_id    comp       ht         at team       matchdatetime  \
0   2462794   EngPr  Arsenal  Leicester    A 2017-08-11 18:45:00   
1   2462794   EngPr  Arsenal  Leicester    L 2017-08-11 18:45:00   
2   2462794   EngPr  Arsenal  Leicester    A 2017-08-11 18:45:00   
3   2462794   EngPr  Arsenal  Leicester    L 2017-08-11 18:45:00   
4   2462794   EngPr  Arsenal  Leicester    A 2017-08-11 18:45:00   
5   2462794   EngPr  Arsenal  Leicester    L 2017-08-11 18:45:00   
6   2462794   EngPr  Arsenal  Leicester    A 2017-08-11 18:45:00   
7   2462794   EngPr  Arsenal  Leicester    L 2017-08-11 18:45:00   
8   2462794   EngPr  Arsenal  Leicester    A 2017-08-11 18:45:00   
9   2462795  EngPr1  Arsenal  Leicester    A 2017-08-11 18:45:00   
10  2462795  EngPr1  Arsenal  Leicester    L 2017-08-11 18:45:00   
11  2462795  EngPr1  Arsenal  Leicester    A 2017-08-11 18:45:00   
12  2462795  EngPr1  Arsenal  Leicester    L 2017-08-11 18:45:00   
13  2462795  EngP1r  Arsenal  Leicester    A 2017-08-11 18:45:00   
14  2462795  EngP1r  Arsenal  Leicester    L 2017-08-11 18:45:00   
15  2462795  EngPr1  Arsenal  Leicester    A 2017-08-11 18:45:00   
16  2462795  EngP1r  Arsenal  Leicester    L 2017-08-11 18:45:00   
17  2462795  EngPr1  Arsenal  Leicester    A 2017-08-11 18:45:00   

         eventdatetime   PreviousEventTime   Goal_Flag Union_level Team_SR  \
0  2017-08-11 18:46:00                 NaT         NaN         NaN       A   
1  2017-08-11 18:49:00                 NaT         NaN         NaN       L   
2  2017-08-11 18:46:00 2017-08-11 18:45:00  First Goal      Scored       A   
3  2017-08-11 18:49:00 2017-08-11 18:46:00  First Goal    Conceded       L   
4  2017-08-11 19:13:00 2017-08-11 18:49:00  Other Goal      Scored       A   
5  2017-08-11 19:31:00 2017-08-11 19:13:00   Last Goal      Scored       A   
6  2017-08-11 19:40:00 2017-08-11 19:31:00   Last Goal    Conceded       L   
7  2017-08-11 20:15:00 2017-08-11 19:13:00   Last Goal      Scored       A   
8  2017-08-11 20:15:00 2017-08-11 19:31:00   Last Goal    Conceded       L   
9  2017-08-11 18:46:00                 NaT         NaN         NaN       A   
10 2017-08-11 18:49:00                 NaT         NaN         NaN       L   
11 2017-08-11 18:46:00 2017-08-11 18:45:00  First Goal      Scored       A   
12 2017-08-11 18:49:00 2017-08-11 18:46:00  First Goal    Conceded       L   
13 2017-08-11 19:13:00 2017-08-11 18:49:00  Other Goal      Scored       A   
14 2017-08-11 19:31:00 2017-08-11 19:13:00   Last Goal      Scored       A   
15 2017-08-11 19:40:00 2017-08-11 19:31:00   Last Goal    Conceded       L   
16 2017-08-11 20:15:00 2017-08-11 19:13:00   Last Goal      Scored       A   
17 2017-08-11 20:15:00 2017-08-11 19:31:00   Last Goal    Conceded       L   

    Run_score  Run_sum Match_sta  
0           0        0   Started  
1           0        0   Started  
2           1        1   Winning  
3          -1       -1    Losing  
4           1        1   Winning  
5           1        1   Winning  
6          -1       -1    Losing  
7           1        1   Winning  
8          -1       -1    Losing  
9           0        0   Started  
10          0        0   Started  
11          1        1   Winning  
12         -1       -1    Losing  
13          1        1   Winning  
14          1        1   Winning  
15         -1       -1    Losing  
16          1        1   Winning  
17         -1       -1    Losing  
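
To see why the helper `sort` column puts the copied rows in the right place, here is a minimal sketch on a toy frame. The column names `g` and `v` and the upper-case / `*` markers are made up for illustration and are not part of the question's data; the shifting of the keys follows the same idea as the code above.

```python
import pandas as pd

# Toy frame: two groups with three rows each (hypothetical data).
toy = pd.DataFrame({"g": [1, 1, 1, 2, 2, 2], "v": list("abcdef")})

# Original rows get keys 2, 3, 4 within each group.
toy["sort"] = toy.groupby("g").cumcount() + 2

# Copies of the first two rows are shifted down to keys 0, 1.
first = toy.groupby("g").head(2).copy()
first["sort"] -= 2
first["v"] = first["v"].str.upper()  # mark the copies so they are visible

# Copies of the last two rows are shifted up to keys 5, 6.
last = toy.groupby("g").tail(2).copy()
last["sort"] += 2
last["v"] = last["v"] + "*"          # mark the trailing copies

# Sorting on (group, key) interleaves the copies around each group.
out = (pd.concat([first, last, toy])
         .sort_values(["g", "sort"])
         .reset_index(drop=True))
print(out)
```

Because the keys are unique inside each group, the order of the frames passed to `pd.concat` does not matter; only the sort decides the final layout.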

Sample data:

c = ['gsm_id', 'comp', 'ht', 'at', 'team', 'matchdatetime','eventdatetime', 'PreviousEventTime', 'Goal_Flag', 'Union_level', 'Team_SR', 'Run_score', 'Run_sum', 'Match_sta']
df = pd.DataFrame({'Team_SR': ['A', 'L', 'A', 'A', 'L', 'A', 'L', 'A', 'A', 'L'], 
'team': ['A', 'L', 'A', 'L', 'A', 'A', 'L', 'A', 'L', 'A'], 
'matchdatetime': [pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:45:00')], 
'at': ['Leicester', 'Leicester', 'Leicester', 'Leicester', 'Leicester', 'Leicester', 'Leicester', 'Leicester', 'Leicester', 'Leicester'], 
'Union_level': ['Scored', 'Conceded', 'Scored', 'Scored', 'Conceded', 'Scored', 'Conceded', 'Scored', 'Scored', 'Conceded'], 
'Run_score': [1, -1, 1, 1, -1, 1, -1, 1, 1, -1], 
'eventdatetime': [pd.Timestamp('2017-08-11 18:46:00'), pd.Timestamp('2017-08-11 18:49:00'), pd.Timestamp('2017-08-11 19:13:00'), pd.Timestamp('2017-08-11 19:31:00'), pd.Timestamp('2017-08-11 19:40:00'), pd.Timestamp('2017-08-11 18:46:00'), pd.Timestamp('2017-08-11 18:49:00'), pd.Timestamp('2017-08-11 19:13:00'), pd.Timestamp('2017-08-11 19:31:00'), pd.Timestamp('2017-08-11 19:40:00')], 
'ht': ['Arsenal', 'Arsenal', 'Arsenal', 'Arsenal', 'Arsenal', 'Arsenal', 'Arsenal', 'Arsenal', 'Arsenal', 'Arsenal'], 
'Match_sta': ['Winning', 'Losing', 'Winning', 'Winning', 'Losing', 'Winning', 'Losing', 'Winning', 'Winning', 'Losing'], 
'gsm_id': [2462794, 2462794, 2462794, 2462794, 2462794, 2462795, 2462795, 2462795, 2462795, 2462795],
'Goal_Flag': ['First Goal', 'First Goal', 'Other Goal', 'Last Goal', 'Last Goal', 'First Goal', 'First Goal', 'Other Goal', 'Last Goal', 'Last Goal'], 'Run_sum': [1, -1, 1, 1, -1, 1, -1, 1, 1, -1], 
'comp': ['EngPr', 'EngPr', 'EngPr', 'EngPr', 'EngPr', 'EngPr1', 'EngPr1', 'EngP1r', 'EngP1r', 'EngPr1'], 'PreviousEventTime': [pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:46:00'), pd.Timestamp('2017-08-11 18:49:00'), pd.Timestamp('2017-08-11 19:13:00'), pd.Timestamp('2017-08-11 19:31:00'), pd.Timestamp('2017-08-11 18:45:00'), pd.Timestamp('2017-08-11 18:46:00'), pd.Timestamp('2017-08-11 18:49:00'), pd.Timestamp('2017-08-11 19:13:00'), pd.Timestamp('2017-08-11 19:31:00')]}, columns=c)
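
The question says gsm_id is set as the index, while the sample data above keeps it as an ordinary column. If the index version is what you have, one option is to move it into a column, pad the groups, and set it back. The sketch below is only an illustration: `pad_with_index` is a hypothetical helper name, the demo frame is made up, and it copies a single row on each side to stay short.

```python
import pandas as pd

def pad_with_index(frame, key="gsm_id"):
    """Hypothetical helper: copy the first and last row of each group
    around the group, for a frame whose index is named `key`."""
    tmp = frame.reset_index()                      # index -> ordinary column
    tmp["sort"] = tmp.groupby(key).cumcount() + 1  # originals get 1..n
    first = tmp.groupby(key).head(1).copy()
    first["sort"] = 0                              # goes before the group
    last = tmp.groupby(key).tail(1).copy()
    last["sort"] += 1                              # goes after the group
    out = (pd.concat([first, tmp, last])
             .sort_values([key, "sort"])
             .drop(columns="sort"))
    return out.set_index(key)                      # restore the index

# Made-up demo data with gsm_id as the index.
demo = pd.DataFrame(
    {"Run_score": [1, -1, 1]},
    index=pd.Index([2462794, 2462794, 2462795], name="gsm_id"),
)
padded = pad_with_index(demo)
print(padded)
```

The copied rows can then be modified before the `pd.concat` step in the same way as `df1` and `df2` in the answer.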