I have a DataFrame that records a car's driving speed. 'id' is the car ID. The DataFrame looks like this:
import pandas as pd
df = pd.DataFrame({'id':[1,1,1,1,1,1,1,1,1,1],
'speed':[10,0,0,20,20,15,0,0,0,10],
'time':['2020-01-17 18:43:29',
'2020-01-17 18:43:48',
'2020-01-17 18:44:09',
'2020-01-17 18:44:28',
'2020-01-17 18:44:48',
'2020-01-17 18:46:05',
'2020-01-17 18:47:15',
'2020-01-17 18:47:24',
'2020-01-17 18:53:07',
'2020-01-17 18:58:36']})
df['time']=pd.to_datetime(df['time'])
I want to estimate the stop time (speed = 0). So I first do this:
df['time_diff']=(df['time'].shift(-1)-df['time']).dt.seconds
Now I want to sum the 'time_diff' column over the rows where 'speed = 0'. The result should look like this:
[0, 40, 40, 0, 0, 0, 681, 681, 681, 0]
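The key idea is that the sum has to be taken over each run of consecutive "speed = 0" rows. I did check some similar answers, but couldn't find a good solution.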
Answer 0 (score: 2)
IIUC, try:
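c = df['speed'].eq(0)  # condition: True where the car is stopped
# seconds until the next reading, as in your question (NaN on the last row)
s = (df['time'].shift(-1) - df['time']).dt.seconds
# c.ne(c.shift()).cumsum() gives each run of consecutive equal values in c its own label;
# group by that label, sum within each run, then replace with 0 where c isn't met
s.groupby(c.ne(c.shift()).cumsum()).transform('sum').where(c, 0)  # optionally: .astype(int).tolist()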
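
For reference, here is the same idea as a self-contained sketch with the intermediate steps named. The names `stopped`, `gap`, `run_id` and the output column `stop_time` are just illustrative, not from the question. It also swaps `.dt.seconds` for `.dt.total_seconds()`, which I'd assume is what you want if a gap could ever exceed a day, and it assumes a single car, so it does not group by 'id'.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1] * 10,
    'speed': [10, 0, 0, 20, 20, 15, 0, 0, 0, 10],
    'time': pd.to_datetime([
        '2020-01-17 18:43:29', '2020-01-17 18:43:48',
        '2020-01-17 18:44:09', '2020-01-17 18:44:28',
        '2020-01-17 18:44:48', '2020-01-17 18:46:05',
        '2020-01-17 18:47:15', '2020-01-17 18:47:24',
        '2020-01-17 18:53:07', '2020-01-17 18:58:36',
    ]),
})

# True on rows where the car is stopped
stopped = df['speed'].eq(0)

# Seconds from each reading to the next one; the last row has no
# successor, so it is NaN (the sum below skips NaN)
gap = (df['time'].shift(-1) - df['time']).dt.total_seconds()

# A new label starts every time `stopped` changes value, so each run of
# consecutive stopped (or moving) rows shares one label
run_id = stopped.ne(stopped.shift()).cumsum()

# Sum the gaps within each run, broadcast the sum back to every row of
# the run, then zero out the rows where the car is moving
df['stop_time'] = (
    gap.groupby(run_id).transform('sum').where(stopped, 0).astype(int)
)

print(run_id.tolist())           # [1, 2, 2, 3, 3, 3, 4, 4, 4, 5]
print(df['stop_time'].tolist())  # [0, 40, 40, 0, 0, 0, 681, 681, 681, 0]
```

`transform('sum')` is used rather than `sum()` because it returns a result aligned to the original rows, which is what lets every row of a stop carry the total for that stop.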