How to convert a (possibly negative) Pandas TimeDelta to minutes (float)?

Time: 2016-08-09 15:16:07

Tags: python datetime pandas

I have a dataframe that looks like this:

df[['timestamp_utc','minute_ts','delta']].head()
Out[47]: 
            timestamp_utc           minute_ts                    delta
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000

df['delta']=df.minute_ts-df.timestamp_utc

timestamp_utc     datetime64[ns]
minute_ts         datetime64[ns]
delta            timedelta64[ns]

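The problem is that I would like to get the (possibly negative) number of minutes between timestamp_utc and minute_ts, ignoring the seconds component.

So for the first row I would like to get -120. Indeed, 2015-05-21 12:06:00 is 120 minutes before 2015-05-21 14:06:33.414.

What is the most idiomatic pandas way to do this?

Many thanks!

2 answers:

Answer 0 (score: 3)

You can use: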

import numpy as np

# Dividing by a one-minute timedelta gives the difference in minutes as a float
df['a'] = df['delta'] / np.timedelta64(1, 'm')
print (df)
            timestamp_utc           minute_ts                    delta  \
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000   
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000   
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000   
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000   
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000   

          a  
0 -120.5569  
1 -119.5569  
2 -118.5569  
3 -117.5569  
4 -116.5569  

Then convert the float to int. Note that astype(int) truncates toward zero, so -120.5569 becomes -120:

df['a'] = (df['delta'] / np.timedelta64(1, 'm')).astype(int)
print (df)
            timestamp_utc           minute_ts                    delta    a
0 2015-05-21 14:06:33.414 2015-05-21 12:06:00 -1 days +21:59:26.586000 -120
1 2015-05-21 14:06:33.414 2015-05-21 12:07:00 -1 days +22:00:26.586000 -119
2 2015-05-21 14:06:33.414 2015-05-21 12:08:00 -1 days +22:01:26.586000 -118
3 2015-05-21 14:06:33.414 2015-05-21 12:09:00 -1 days +22:02:26.586000 -117
4 2015-05-21 14:06:33.414 2015-05-21 12:10:00 -1 days +22:03:26.586000 -116
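
As a hedged variation on the same idea, the sketch below uses the `.dt.total_seconds()` accessor instead of dividing by `np.timedelta64`, and makes the rounding direction explicit. The two-row frame is a cut-down copy of the question's data, and the names `minutes_float`, `minutes_trunc` and `minutes_floor` are illustrative only, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Cut-down copy of the question's data (first two rows only)
df = pd.DataFrame({
    'minute_ts': pd.to_datetime(['2015-05-21 12:06:00', '2015-05-21 12:07:00']),
    'timestamp_utc': pd.to_datetime(['2015-05-21 14:06:33.414'] * 2),
})
df['delta'] = df['minute_ts'] - df['timestamp_utc']

# Difference in minutes as a float (about -120.5569 and -119.5569)
minutes_float = df['delta'].dt.total_seconds() / 60

# Truncation rounds toward zero: -120.5569 -> -120 (same result as astype(int))
minutes_trunc = np.trunc(minutes_float).astype(int)

# Floor rounds toward negative infinity: -120.5569 -> -121
minutes_floor = np.floor(minutes_float).astype(int)

print(minutes_trunc.tolist())  # [-120, -119]
print(minutes_floor.tolist())  # [-121, -120]
```

Truncation is what gives the -120 asked for in the question. Floor would give -121 for the same row.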

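Answer 1 (score: 1)

You can use the Timedelta object in Pandas and then use floor division in a list comprehension to calculate the minutes. Note that the seconds attribute of a Timedelta returns a number of seconds that is >= 0 and less than one day, so you must explicitly convert the days to their corresponding minutes.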
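
To illustrate the point about the seconds attribute, here is a minimal sketch using the first row of the question. The variable names `td` and `minutes` are illustrative only.

```python
import pandas as pd

# First row of the question: minute_ts - timestamp_utc
td = pd.Timestamp('2015-05-21 12:06:00') - pd.Timestamp('2015-05-21 14:06:33.414')

# A negative Timedelta is stored as a negative day count plus a
# non-negative remainder: -1 days +21:59:26.586000
print(td.days)     # -1
print(td.seconds)  # 79166, i.e. 21 h 59 min 26 s

# Using td.seconds alone would ignore the -1 day, so add the days back as minutes
minutes = td.days * 24 * 60 + td.seconds // 60
print(minutes)     # -1440 + 1319 = -121
```

The code below applies the same arithmetic to every row: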

import pandas as pd

df = pd.DataFrame({'minute_ts': [pd.Timestamp('2015-05-21 12:06:00'), 
                                 pd.Timestamp('2015-05-21 12:07:00'), 
                                 pd.Timestamp('2015-05-21 12:08:00'), 
                                 pd.Timestamp('2015-05-21 12:09:00'), 
                                 pd.Timestamp('2015-05-21 12:10:00')], 
                   'timestamp_utc': [pd.Timestamp('2015-05-21 14:06:33.414')] * 5})

df['minutes_neg'] = [td.days * 24 * 60 + td.seconds//60 
                 for td in [pd.Timedelta(delta) 
                            for delta in df.minute_ts - df.timestamp_utc]]

df['minutes_pos'] = [td.days * 24 * 60 + td.seconds//60 
                 for td in [pd.Timedelta(delta) 
                            for delta in df.timestamp_utc - df.minute_ts]]

>>> df
            minute_ts           timestamp_utc  minutes_neg  minutes_pos
0 2015-05-21 12:06:00 2015-05-21 14:06:33.414         -121          120
1 2015-05-21 12:07:00 2015-05-21 14:06:33.414         -120          119
2 2015-05-21 12:08:00 2015-05-21 14:06:33.414         -119          118
3 2015-05-21 12:09:00 2015-05-21 14:06:33.414         -118          117
4 2015-05-21 12:10:00 2015-05-21 14:06:33.414         -117          116

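Note that the two minute columns differ by one in magnitude because of the floor division. For example, 90 // 60 = 1, but -90 // 60 = -2. You could add one to the result whenever it is negative, but then the edge case of a difference that is an exact whole number of minutes (measured to millisecond precision) would be off by one minute.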
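
One possible way to avoid the rounding question altogether is sketched below. It assumes that "ignoring the seconds component" means dropping the seconds from timestamp_utc before subtracting. `Series.dt.floor('min')` does that, so every difference is already a whole number of minutes and floor division is exact for both signs. The third row and the names `whole_minute_utc` and `minutes` are hypothetical additions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'minute_ts': pd.to_datetime(['2015-05-21 12:06:00',
                                 '2015-05-21 12:07:00',
                                 '2015-05-21 14:10:00']),  # hypothetical positive case
    'timestamp_utc': pd.to_datetime(['2015-05-21 14:06:33.414'] * 3),
})

# Drop seconds and milliseconds: 14:06:33.414 -> 14:06:00
whole_minute_utc = df['timestamp_utc'].dt.floor('min')

# The difference is now an exact multiple of 60 seconds
delta = df['minute_ts'] - whole_minute_utc
df['minutes'] = (delta.dt.total_seconds() // 60).astype(int)

print(df['minutes'].tolist())  # [-120, -119, 4]
```

Because the seconds are removed before the subtraction, no add-one correction for negative results is needed.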