I want to fill the None values in activity_station. The data looks like this; I created a few extra columns to make the condition simpler.
Shift_id activity_name activity_id activity_begin_time activity_end_time activity_station shift code day
0 123 start D01-MCK-DI 09:00 09:05 None D01 MCK DI
1 123 work D01-MCK-DI 09:05 12:00 Za D01 MCK DI
2 123 drive D01-MCK-DI 12:00 12:30 Ro D01 MCK DI
3 184 start D01-MV-DI 09:00 09:05 None D01 MV DI
4 184 work D01-MV-DI 09:05 12:00 Ca D01 MV DI
5 184 drive D01-MV-DI 12:00 12:30 None D01 MV DI
Code to load the data, if needed:
import pandas as pd

df = pd.DataFrame({
    'Shift_id': [123, 123, 123, 184, 184, 184],
    'activity_name': ['start', 'work', 'drive', 'start', 'work', 'drive'],
    'activity_id': ['D01-MCK-DI', 'D01-MCK-DI', 'D01-MCK-DI', 'D01-MV-DI', 'D01-MV-DI', 'D01-MV-DI'],
    'activity_begin_time': ['09:00', '09:05', '12:00', '09:00', '09:05', '12:00'],
    'activity_end_time': ['09:05', '12:00', '12:30', '09:05', '12:00', '12:30'],
    'activity_station': ['None', 'Za', 'Ro', 'None', 'Ca', 'None']})
# Split activity_id (e.g. 'D01-MCK-DI') into its three parts
df[['shift', 'code', 'day']] = df['activity_id'].str.split(pat="-", expand=True)
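If an MV row has None in the activity_station column,
then find the MCK row with the same shift and day, and assign MCK's activity_station value to the MV row's None value.
I tried a few if/else return statements, but none of them worked.
The result should look like this: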
Shift_id activity_name activity_id activity_begin_time activity_end_time activity_station shift code day
0 123 start D01-MCK-DI 09:00 09:05 None D01 MCK DI
1 123 work D01-MCK-DI 09:05 12:00 Za D01 MCK DI
2 123 drive D01-MCK-DI 12:00 12:30 Ro D01 MCK DI
3 184 start D01-MV-DI 09:00 09:05 None D01 MV DI
4 184 work D01-MV-DI 09:05 12:00 Ca D01 MV DI
5 184 drive D01-MV-DI 12:00 12:30 Ro D01 MV DI
Answer 0 (score: 0)
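IIUC, you need one more grouping column to get your desired output. You are currently describing grouping by shift and day, but that still yields only a single group, so I assume you also mean to group by activity_name. If that is the case, you can use transform() after replacing the None values in your dataframe with NaN (i.e. np.nan):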
df['activity_station'] = df.groupby(['shift','day','activity_name'])['activity_station'].transform(lambda x: x.ffill())
This produces your desired output:
Shift_id activity_name activity_id activity_begin_time activity_end_time \
0 123 start D01-MCK-DI 09:00 09:05
1 123 work D01-MCK-DI 09:05 12:00
2 123 drive D01-MCK-DI 12:00 12:30
3 184 start D01-MV-DI 09:00 09:05
4 184 work D01-MV-DI 09:05 12:00
5 184 drive D01-MV-DI 12:00 12:30
activity_station shift code day
0 NaN D01 MCK DI
1 Za D01 MCK DI
2 Ro D01 MCK DI
3 NaN D01 MV DI
4 Ca D01 MV DI
5 Ro D01 MV DI
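For completeness, here is a minimal end-to-end sketch of the steps described above, including the replacement of None with NaN that the one-liner relies on. It assumes, as in the sample data, that missing stations are stored as the string 'None', and it uses the groupby ffill() shortcut, which should be equivalent to the transform(lambda x: x.ffill()) call. The two time columns are left out for brevity.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Shift_id': [123, 123, 123, 184, 184, 184],
    'activity_name': ['start', 'work', 'drive', 'start', 'work', 'drive'],
    'activity_id': ['D01-MCK-DI'] * 3 + ['D01-MV-DI'] * 3,
    'activity_station': ['None', 'Za', 'Ro', 'None', 'Ca', 'None'],
})
df[['shift', 'code', 'day']] = df['activity_id'].str.split(pat='-', expand=True)

# Assumption: missing stations are the literal string 'None', so turn them
# into real NaN values first; ffill() only fills true missing values.
df['activity_station'] = df['activity_station'].replace('None', np.nan)

# Within each (shift, day, activity_name) group, carry the last known
# station forward into the rows that follow it.
df['activity_station'] = (
    df.groupby(['shift', 'day', 'activity_name'])['activity_station'].ffill()
)

print(df[['activity_id', 'activity_name', 'activity_station']])
```

Note that a forward fill depends on row order: the MV row is only filled here because the matching MCK row comes earlier in the frame. The two start rows stay NaN because neither row in that group has a station to copy.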