completed deadline
15-07-2013 23:10 15-07-2013 23:15
16-07-2013 00:20 16-07-2013 00:15
16-07-2013 00:20 16-07-2013 00:15
16-07-2013 21:04 16-07-2013 21:30
16-07-2013 21:58 16-07-2013 22:00
16-07-2013 23:21 16-07-2013 23:15
16-07-2013 23:21 16-07-2013 23:15
17-07-2013 00:19 17-07-2013 00:15
17-07-2013 00:19 17-07-2013 00:15
17-07-2013 21:18 17-07-2013 21:30
17-07-2013 22:07 17-07-2013 22:00
When I compute data['completed'] - data['deadline']
I get:
-1 day, 23:55:00 # on time
0:05:00
0:05:00
-1 day, 23:34:00 # on time
-1 day, 23:58:00 # on time
0:06:00
0:06:00
0:04:00
0:04:00
-1 day, 23:48:00 # on time
0:07:00
But when I assign data['time_delay'] = data['completed'] - data['deadline']
and print data['time_delay']
I get:
-300000000000
300000000000
300000000000
-1560000000000
-120000000000
360000000000
360000000000
240000000000
240000000000
-720000000000
420000000000
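When the output is written to CSV, I get the same result.
How do I:
handle this output?
write the output to CSV in 'minutes' format?
handle the 'on time' output?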
Answer 0 (score: 2)
>>> import sys
>>> import pandas as pd
>>> data = pd.read_csv('1.csv', parse_dates=[0,1])
>>> data['time_delay'] = data['completed'] - data['deadline']
>>> print data['time_delay']
0 -00:05:00
1 00:05:00
2 00:05:00
3 -00:26:00
4 -00:02:00
Name: time_delay, dtype: timedelta64[ns]
>>> data.to_csv(sys.stdout)
,completed,deadline,time_delay
0,2013-07-15 23:10:00,2013-07-15 23:15:00,-300000000000
1,2013-07-16 00:20:00,2013-07-16 00:15:00,300000000000
2,2013-07-16 00:20:00,2013-07-16 00:15:00,300000000000
3,2013-07-16 21:04:00,2013-07-16 21:30:00,-1560000000000
4,2013-07-16 21:58:00,2013-07-16 22:00:00,-120000000000
>>> data['time_delay'] = data['time_delay'].apply(pd.lib.repr_timedelta64)
>>> data.to_csv(sys.stdout)
,completed,deadline,time_delay
0,2013-07-15 23:10:00,2013-07-15 23:15:00,-00:05:00
1,2013-07-16 00:20:00,2013-07-16 00:15:00,00:05:00
2,2013-07-16 00:20:00,2013-07-16 00:15:00,00:05:00
3,2013-07-16 21:04:00,2013-07-16 21:30:00,-00:26:00
4,2013-07-16 21:58:00,2013-07-16 22:00:00,-00:02:00
Note that pandas.lib.repr_timedelta64 is undocumented, so this code may break in a future release.
(I used pandas 0.11.0.)
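The large integers in the question are the same time differences expressed in nanoseconds: -300000000000 ns is -300 s, or -5 minutes. As a rough sketch of a route that avoids undocumented helpers, and assuming a pandas version that provides the `.dt` accessor on timedelta columns, the difference can be converted to minutes before writing the CSV. The inlined comma-separated sample and the `delay_minutes` column name are assumptions made for illustration.

```python
import io

import pandas as pd

# First five rows of the question's data, inlined so the sketch is
# self-contained (the comma-separated layout is an assumption).
raw = io.StringIO(
    "completed,deadline\n"
    "15-07-2013 23:10,15-07-2013 23:15\n"
    "16-07-2013 00:20,16-07-2013 00:15\n"
    "16-07-2013 00:20,16-07-2013 00:15\n"
    "16-07-2013 21:04,16-07-2013 21:30\n"
    "16-07-2013 21:58,16-07-2013 22:00\n"
)
data = pd.read_csv(raw)

# Parse day-first timestamps explicitly to avoid ambiguity.
for col in ("completed", "deadline"):
    data[col] = pd.to_datetime(data[col], format="%d-%m-%Y %H:%M")

# Negative values mean the task finished before its deadline.
delay = data["completed"] - data["deadline"]
data["delay_minutes"] = delay.dt.total_seconds() / 60

# Write only plain numbers, so no nanosecond integers reach the CSV.
csv_text = data.to_csv(index=False)
print(csv_text)
```

Storing the delay as a plain number of minutes keeps the sign meaningful (negative is early, positive is late) and sidesteps how a given pandas version serializes timedelta values.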
Answer 1 (score: 1)
Try this:
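def func(x, y):
    # x is the completion time, y is the deadline.
    # total_seconds() keeps the hours, so a 65-minute delay reports 65, not 5.
    if x > y:
        return 'delayed by ' + str(int((x - y).total_seconds() // 60)) + ' minutes'
    else:
        return 'on time by ' + str(int((y - x).total_seconds() // 60)) + ' minutes'

data["ontime"] = data.apply(lambda row: func(row["completed"], row["deadline"]), axis=1)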
This gives:
completed deadline ontime
0 2013-07-15 23:10:00 2013-07-15 23:15:00 on time by 5 minutes
1 2013-07-16 00:20:00 2013-07-16 00:15:00 delayed by 5 minutes
2 2013-07-16 00:20:00 2013-07-16 00:15:00 delayed by 5 minutes
3 2013-07-16 21:04:00 2013-07-16 21:30:00 on time by 26 minutes
4 2013-07-16 21:58:00 2013-07-16 22:00:00 on time by 2 minutes
5 2013-07-16 23:21:00 2013-07-16 23:15:00 delayed by 6 minutes
6 2013-07-16 23:21:00 2013-07-16 23:15:00 delayed by 6 minutes
7 2013-07-17 00:19:00 2013-07-17 00:15:00 delayed by 4 minutes
8 2013-07-17 00:19:00 2013-07-17 00:15:00 delayed by 4 minutes
9 2013-07-17 21:18:00 2013-07-17 21:30:00 on time by 12 minutes
10 2013-07-17 22:07:00 2013-07-17 22:00:00 delayed by 7 minutes
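The same labels can probably be produced without a row-wise apply. The following is a minimal sketch, assuming a pandas version with the `.dt` accessor; the three sample rows are taken from the question, and the intermediate names (`delta`, `late`, `minutes`) are illustrative only.

```python
import pandas as pd

# Three rows from the question, built inline so the sketch runs on its own.
data = pd.DataFrame({
    "completed": pd.to_datetime(
        ["2013-07-15 23:10", "2013-07-16 00:20", "2013-07-16 21:04"]),
    "deadline": pd.to_datetime(
        ["2013-07-15 23:15", "2013-07-16 00:15", "2013-07-16 21:30"]),
})

delta = data["completed"] - data["deadline"]
late = delta > pd.Timedelta(0)

# Whole minutes of absolute difference, as strings for concatenation.
minutes = (delta.dt.total_seconds().abs() // 60).astype(int).astype(str)

on_time = "on time by " + minutes + " minutes"
delayed = "delayed by " + minutes + " minutes"

# Keep the "on time" label where the row is not late, else use "delayed".
data["ontime"] = on_time.where(~late, delayed)
print(data)
```

As in the function above, a task completed exactly at its deadline is labeled "on time by 0 minutes".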