Pandas Timedelta error with datetime64[ns, UTC] fields

Time: 2018-05-23 22:15:55

Tags: python python-2.7 pandas time-series

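I have a DataFrame with the two time fields shown below. When I try to use Timedelta on the difference between them, I get the error message below. I have included the .info() output for the fields. Does anyone see what the problem is, and can you suggest how to fix it? Any tips are much appreciated.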

Data:

                            et_utc                    ts_utc
0 2018-05-02 09:24:29.304000+00:00 2018-05-02 09:39:15+00:00
1 2018-05-02 09:26:12.132000+00:00 2018-05-02 09:39:15+00:00
2 2018-05-02 09:28:37.913000+00:00 2018-05-02 09:39:12+00:00
3 2018-05-02 09:28:37.913000+00:00 2018-05-02 09:28:49+00:00
4 2018-05-02 10:39:48.820000+00:00 2018-05-02 10:39:48+00:00


Data description:

df[['et_utc','ts_utc']].info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 25625 entries, 0 to 25624
Data columns (total 2 columns):
et_utc    25625 non-null datetime64[ns, UTC]
ts_utc    25625 non-null datetime64[ns, UTC]
dtypes: datetime64[ns, UTC](2)
memory usage: 1.8 MB


Code:

df['t_delta']=pd.Timedelta(df['et_utc'] - df['ts_utc']).seconds


Error:

ValueError: Value must be Timedelta, string, integer, float, timedelta or convertible

2 answers:

Answer 0 (score: 2)

Use the Series dt accessor, which gives access to the methods and attributes of a datetime (or timedelta) Series.

>>> (df['et_utc'] - df['ts_utc']).dt.total_seconds()
0   -885.696
1   -782.868
2   -634.087
3    -11.087
4      0.820
dtype: float64
>>> df['t_delta'] = (df['et_utc'] - df['ts_utc']).dt.total_seconds()
>>>
>>> print(df)
                   et_utc              ts_utc  t_delta
0 2018-05-02 09:24:29.304 2018-05-02 09:39:15 -885.696
1 2018-05-02 09:26:12.132 2018-05-02 09:39:15 -782.868
2 2018-05-02 09:28:37.913 2018-05-02 09:39:12 -634.087
3 2018-05-02 09:28:37.913 2018-05-02 09:28:49  -11.087
4 2018-05-02 10:39:48.820 2018-05-02 10:39:48    0.820
>>> 

Datetimelike properties
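
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It rebuilds two of the sample rows (the construction with pd.to_datetime is an assumption about how the columns were created) and contrasts .dt.total_seconds() with .dt.seconds, the attribute the question was reaching for. The variable name seconds_component is made up for illustration.

```python
import pandas as pd

# Two of the sample rows from the question, rebuilt as tz-aware UTC columns.
df = pd.DataFrame({
    "et_utc": pd.to_datetime(
        ["2018-05-02 09:24:29.304", "2018-05-02 10:39:48.820"], utc=True
    ),
    "ts_utc": pd.to_datetime(
        ["2018-05-02 09:39:15", "2018-05-02 10:39:48"], utc=True
    ),
})

# Subtracting two datetime columns gives a Series of timedeltas.
# pd.Timedelta() expects a single value, so it cannot wrap this Series;
# the .dt accessor works element-wise instead.
delta = df["et_utc"] - df["ts_utc"]

# Signed duration in seconds, fractional part included.
df["t_delta"] = delta.dt.total_seconds()

# Only the seconds field of each timedelta (0 to 86399). A negative
# duration is stored as "-1 days + remainder", so this is large and
# positive for the first row.
seconds_component = delta.dt.seconds

print(df["t_delta"].tolist())
print(seconds_component.tolist())
```

Note that .dt.seconds is rarely what you want for elapsed time: for the first row it returns 85514 rather than anything close to -885.696.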

Casting with astype is equivalent to division, and it loses a little resolution.

>>> (df['et_utc'] - df['ts_utc']).astype('timedelta64[s]')
0   -886.0
1   -783.0
2   -635.0
3    -12.0
4      0.0
dtype: float64
>>> 
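
As far as I know, the result of astype('timedelta64[s]') depends on the pandas version (newer releases may return a timedelta Series instead of floats), so check it against your installed version. A sketch that spells out the division explicitly is below. The sample values are three of the differences shown above, entered as integer milliseconds, and the names whole_seconds and exact_seconds are made up for illustration.

```python
import pandas as pd

# Three of the differences from the output above, as integer milliseconds
# so the values are exact.
delta = pd.Series(pd.to_timedelta([-885696, -11087, 820], unit="ms"))

# Floor division by a one-second Timedelta gives whole seconds as integers,
# rounding toward negative infinity.
whole_seconds = delta // pd.Timedelta(seconds=1)

# True division keeps the fractional part (same values as .dt.total_seconds()).
exact_seconds = delta / pd.Timedelta(seconds=1)

print(whole_seconds.tolist())
print(exact_seconds.tolist())
```

Dividing by a different Timedelta, for example pd.Timedelta(minutes=1), converts to another unit in the same way.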

Answer 1 (score: 1)

If you want the timedelta in seconds, then according to the official documentation you can do this:

df['t_delta']=(df['et_utc'] - df['ts_utc']).astype('timedelta64[s]')