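The following DataFrame, "Roadtrip", records vehicle travel times between point A and point B. Using pandas, how can I compute a new column "TravelTime" that holds the number of minutes needed to travel from point A to point B (assuming the "Leave" and "Arrive" entries are strings)?
Desired output: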
Leave Arrive TravelTime(in minutes)
0 18:26 21:16 ????
1 12:18 14:19 ????
2 06:23 13:02 ????
3 15:52 03:14 ????
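Answer 0: (score: 0)
Since you only provided times and no dates, I assume that whenever the leave time is later than the arrive time, the arrival is on the following day.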
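import numpy as np
import pandas as pd

# parse both string columns as datetimes (all rows get the same date)
df1 = df.apply(pd.to_datetime)
# difference in whole minutes; negative when the arrival is on the next day
diff = (df1.Arrive - df1.Leave).dt.total_seconds() // 60
# add one day (1440 minutes) to the negative differences
df['New'] = np.where(diff < 0, diff + 24 * 60, diff)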
df
Out[1491]:
Leave Arrive New
0 18:26 21:16 170.0
1 12:18 14:19 121.0
2 06:23 13:02 399.0
3 15:52 03:14 682.0
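A self-contained sketch of the same next-day assumption may make the approach easier to try. The sample frame below is rebuilt from the question's table, the explicit `format="%H:%M"` is an assumption about how the strings are laid out, and taking the difference modulo 1440 minutes is an alternative to the `np.where` branch above rather than the answerer's own code.

```python
import pandas as pd

# Sample data rebuilt from the question's table.
df = pd.DataFrame({
    "Leave": ["18:26", "12:18", "06:23", "15:52"],
    "Arrive": ["21:16", "14:19", "13:02", "03:14"],
})

# Parse the "HH:MM" strings; every row gets the same placeholder date.
leave = pd.to_datetime(df["Leave"], format="%H:%M")
arrive = pd.to_datetime(df["Arrive"], format="%H:%M")

# Difference in whole minutes; negative when the arrival is on the next day.
minutes = (arrive - leave).dt.total_seconds() // 60

# Wrapping modulo one day (1440 minutes) turns a negative difference
# into the overnight travel time.
df["TravelTime"] = (minutes % (24 * 60)).astype(int)
print(df)
```

The modulo only works for trips shorter than 24 hours, which is the same limitation as the original answer.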
Answer 1: (score: 0)
I think you need:
import numpy as np
import pandas as pd

# convert both columns to timedeltas
a = pd.to_timedelta(df['Arrive'] + ':00')
b = pd.to_timedelta(df['Leave'] + ':00')
# one-day timedelta, added when the arrival falls on the next day
oneday = pd.Timedelta(days=1)
# zero timedelta
zero = pd.Timedelta(0)
# True where the arrival is later than the departure on the same day
mask = (a - b) > zero
# take the plain difference where possible, otherwise add one day; finally convert to minutes
df['TravelTime'] = (np.where(mask, a - b, a + oneday - b) /
                    np.timedelta64(1, 'm')).astype(int)
print(df)
Leave Arrive TravelTime
0 18:26 21:16 170
1 12:18 14:19 121
2 06:23 13:02 399
3 15:52 03:14 682
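The figures in both answers can be cross-checked without pandas. The sketch below uses only the standard library; `travel_minutes` is a hypothetical helper introduced here for illustration, and it makes the same assumption as the answers, namely that an arrival time earlier than the leave time means arrival on the next day.

```python
from datetime import datetime


def travel_minutes(leave: str, arrive: str) -> int:
    """Minutes from `leave` to `arrive` ("HH:MM" strings), wrapping past midnight."""
    fmt = "%H:%M"
    delta = datetime.strptime(arrive, fmt) - datetime.strptime(leave, fmt)
    # A negative difference means the arrival is on the next day;
    # the modulo adds the missing 24 hours.
    return int(delta.total_seconds() // 60) % (24 * 60)


# Rows taken from the question's table.
rows = [("18:26", "21:16"), ("12:18", "14:19"), ("06:23", "13:02"), ("15:52", "03:14")]
results = [travel_minutes(leave, arrive) for leave, arrive in rows]
print(results)
```

For the overnight row, 15:52 to midnight is 488 minutes and midnight to 03:14 is 194 minutes, which gives the 682 shown in both outputs.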