I am a Python beginner.
I have a large dataframe. The data looks like this:
df
ID Annotation Time
A Boarding 7:20:00
A Alighting 8:30:50
B Boarding 13:45:00
B Alighting 14:00:05
C Boarding 17:05:00
C Alighting 17:15:00
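For anyone who wants to reproduce this, the sample above can be built as follows. This is only a sketch: it assumes `Time` is stored as plain strings, which may not match how the real (much larger) data is loaded.

```python
import pandas as pd

# Sample rows copied from the table above.
# Assumption: the Time column holds strings, not datetime values.
df = pd.DataFrame({
    "ID": ["A", "A", "B", "B", "C", "C"],
    "Annotation": ["Boarding", "Alighting"] * 3,
    "Time": ["7:20:00", "8:30:50", "13:45:00",
             "14:00:05", "17:05:00", "17:15:00"],
})
print(df)
```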
I want to calculate the travel time between boarding and alighting for each ID. My expected result looks like this:
result
ID Time Boarding Time Alighting Travel Time (Minutes)
A 7:20:00 8:30:50 70.83
B 13:45:00 14:00:05 15.08
C 17:05:00 17:15:00 10.00
I would appreciate any suggestions. Thanks in advance.
Answer 0: (score: 1)
This is indeed a case for pivot:
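import pandas as pd

# change to datetime
df['Time'] = pd.to_datetime(df['Time'])
# one row per ID, one column per annotation
new_df = df.pivot(index='ID', columns='Annotation', values='Time')
s = new_df['Alighting'] - new_df['Boarding']
# total_seconds() stays correct even if a difference exceeds one day
new_df['Travel Time'] = s.dt.total_seconds() / 60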
Output:
Annotation Alighting Boarding Travel Time
ID
A 2019-07-04 08:30:50 2019-07-04 07:20:00 70.833333
B 2019-07-04 14:00:05 2019-07-04 13:45:00 15.083333
C 2019-07-04 17:15:00 2019-07-04 17:05:00 10.000000
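The output above carries a calendar date and unrounded minutes, while the question asks for clock times and two decimals. A minimal sketch of one way to get closer to that layout is below. It assumes exactly one Boarding and one Alighting row per ID (`pivot` raises an error on duplicate ID/Annotation pairs), and the names `wide`, `travel` and `result` are made up for illustration. The elapsed time is taken with `.dt.total_seconds()`, which also remains correct if a difference ever exceeds a day.

```python
import pandas as pd

# Sample data from the question (assumed to be strings).
df = pd.DataFrame({
    "ID": ["A", "A", "B", "B", "C", "C"],
    "Annotation": ["Boarding", "Alighting"] * 3,
    "Time": ["7:20:00", "8:30:50", "13:45:00",
             "14:00:05", "17:05:00", "17:15:00"],
})

# Parse with an explicit format so every row gets the same placeholder date.
times = pd.to_datetime(df["Time"], format="%H:%M:%S")
wide = df.assign(Time=times).pivot(index="ID", columns="Annotation", values="Time")

# Full elapsed time in minutes.
travel = (wide["Alighting"] - wide["Boarding"]).dt.total_seconds() / 60

# Rebuild the table with the column names used in the question.
result = pd.DataFrame({
    "ID": wide.index.to_numpy(),
    "Time Boarding": wide["Boarding"].dt.strftime("%H:%M:%S").to_numpy(),
    "Time Alighting": wide["Alighting"].dt.strftime("%H:%M:%S").to_numpy(),
    "Travel Time (Minutes)": travel.round(2).to_numpy(),
})
print(result)
```

Note that `strftime("%H:%M:%S")` zero-pads the hour, so 7:20:00 is shown as 07:20:00.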
Answer 1: (score: 1)
A solution without pivot:
>>> df2 = pd.DataFrame({'Time %s' % i: pd.to_datetime(pd.Series(x.values.ravel()))
...                     for i, x in df.iloc[:, 1:].set_index('Annotation').T.groupby(level=0, axis=1)})
>>> df2['ID'] = df['ID'].unique()
>>> df2['Travel Time (Minutes)'] = (df2['Time Alighting'] - df2['Time Boarding']).dt.total_seconds() / 60
>>> df2 = df2[['ID', 'Time Boarding', 'Time Alighting', 'Travel Time (Minutes)']]
>>> df2
ID Time Boarding Time Alighting Travel Time (Minutes)
0 A 2019-07-04 07:20:00 2019-07-04 08:30:50 70.833333
1 B 2019-07-04 13:45:00 2019-07-04 14:00:05 15.083333
2 C 2019-07-04 17:05:00 2019-07-04 17:15:00 10.000000
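Recent pandas releases deprecate the `axis=1` argument of `groupby`, so the one-liner above may warn or fail depending on the installed version. As a hedged alternative that also avoids pivot, the boarding and alighting rows can be split with boolean masks and joined back on `ID`. This is a sketch, not the answerer's method: the names `boarding` and `alighting` are made up here, and it assumes one row of each kind per ID (with repeats, `merge` would pair every boarding with every alighting for that ID).

```python
import pandas as pd

# Sample data from the question (assumed to be strings).
df = pd.DataFrame({
    "ID": ["A", "A", "B", "B", "C", "C"],
    "Annotation": ["Boarding", "Alighting"] * 3,
    "Time": ["7:20:00", "8:30:50", "13:45:00",
             "14:00:05", "17:05:00", "17:15:00"],
})
df["Time"] = pd.to_datetime(df["Time"], format="%H:%M:%S")

# Split the rows by annotation and rename Time to the target column names.
boarding = (df.loc[df["Annotation"] == "Boarding", ["ID", "Time"]]
              .rename(columns={"Time": "Time Boarding"}))
alighting = (df.loc[df["Annotation"] == "Alighting", ["ID", "Time"]]
               .rename(columns={"Time": "Time Alighting"}))

# Join the two halves on ID, then take the difference in minutes.
df2 = boarding.merge(alighting, on="ID", how="inner")
df2["Travel Time (Minutes)"] = (
    (df2["Time Alighting"] - df2["Time Boarding"]).dt.total_seconds() / 60
)
print(df2)
```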