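Suppose I have a pandas.tseries.index.DatetimeIndex that is essentially the business days of 2016. Is there a simple/elegant way to get the successive differences, in days, between its entries? Something like what .diff() does for integer or float DataFrame columns.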
import pandas as pd
import numpy as np
ds = pd.date_range("2016-01-01","2016-12-31",freq='B')
# I was hoping for something like this:
ds.diff().days
# this gives me what I want, but it is ugly and unintuitive
np.diff(ds) / 86400000000000
I also considered np.diff(ds.date), but it gives me an ndarray of datetime.timedelta objects, and I don't know how to convert that to an integer array/Series without a for loop.
Answer 0 (score: 4)
Try this:
In [154]: ds.to_series().diff()
Out[154]:
2016-01-01 NaT
2016-01-04 3 days
2016-01-05 1 days
2016-01-06 1 days
2016-01-07 1 days
2016-01-08 1 days
2016-01-11 3 days
2016-01-12 1 days
2016-01-13 1 days
2016-01-14 1 days
2016-01-15 1 days
2016-01-18 3 days
2016-01-19 1 days
2016-01-20 1 days
2016-01-21 1 days
2016-01-22 1 days
2016-01-25 3 days
2016-01-26 1 days
2016-01-27 1 days
2016-01-28 1 days
2016-01-29 1 days
2016-02-01 3 days
2016-02-02 1 days
2016-02-03 1 days
2016-02-04 1 days
...
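If the leading NaT is not wanted, one possible alternative is to subtract the index from a shifted copy of itself. This is only a sketch and not part of the answer above; the names gaps and day_gaps are made up for illustration. It relies on elementwise subtraction of two equal-length DatetimeIndex objects returning a TimedeltaIndex, which has a .days attribute.

```python
import pandas as pd

ds = pd.date_range("2016-01-01", "2016-12-31", freq="B")

# Subtract each date from its successor; the result is a TimedeltaIndex
# with one element fewer than ds, so there is no leading NaT.
gaps = ds[1:] - ds[:-1]

# .days exposes the whole-day component of each gap as integers.
day_gaps = gaps.days
print(day_gaps[:6].tolist())
```

Unlike to_series().diff(), the result here is positional rather than labelled by date, so it suits cases where only the gap sizes matter.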
Answer 1 (score: 2)
I think that if you need numeric output, you need to add Index.to_series and then Series.dt.days:
print (ds.to_series().diff().dt.days)
2016-01-01 NaN
2016-01-04 3.0
2016-01-05 1.0
2016-01-06 1.0
2016-01-07 1.0
2016-01-08 1.0
2016-01-11 3.0
2016-01-12 1.0
2016-01-13 1.0
2016-01-14 1.0
2016-01-15 1.0
2016-01-18 3.0
2016-01-19 1.0
2016-01-20 1.0
2016-01-21 1.0
2016-01-22 1.0
2016-01-25 3.0
2016-01-26 1.0
2016-01-27 1.0
2016-01-28 1.0
2016-01-29 1.0
2016-02-01 3.0
2016-02-02 1.0
2016-02-03 1.0
2016-02-04 1.0
2016-02-05 1.0
2016-02-08 3.0
2016-02-09 1.0
2016-02-10 1.0
2016-02-11 1.0
...
2016-11-21 3.0
2016-11-22 1.0
2016-11-23 1.0
2016-11-24 1.0
2016-11-25 1.0
2016-11-28 3.0
2016-11-29 1.0
2016-11-30 1.0
2016-12-01 1.0
2016-12-02 1.0
2016-12-05 3.0
2016-12-06 1.0
2016-12-07 1.0
2016-12-08 1.0
2016-12-09 1.0
2016-12-12 3.0
2016-12-13 1.0
2016-12-14 1.0
2016-12-15 1.0
2016-12-16 1.0
2016-12-19 3.0
2016-12-20 1.0
2016-12-21 1.0
2016-12-22 1.0
2016-12-23 1.0
2016-12-26 3.0
2016-12-27 1.0
2016-12-28 1.0
2016-12-29 1.0
2016-12-30 1.0
Freq: B, dtype: float64
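The output above is float64 because the first difference is missing (NaN). If integers are wanted, a rough sketch, assuming it is acceptable to discard that first row, is to drop it and cast. The NumPy-only variant shown alongside divides by a one-day np.timedelta64 instead of a hard-coded nanosecond count. The names int_days and np_days are illustrative, not from the original answers.

```python
import numpy as np
import pandas as pd

ds = pd.date_range("2016-01-01", "2016-12-31", freq="B")

# Drop the leading NaN produced by diff(), then cast the remaining floats to int.
int_days = ds.to_series().diff().dt.days.dropna().astype(int)

# NumPy-only variant: dividing timedelta64 values by a one-day timedelta64
# yields a plain float array of day counts.
np_days = np.diff(ds.values) / np.timedelta64(1, "D")

print(int_days.head())
print(np_days[:5])
```

The pandas version keeps the date labels on each gap; the NumPy version returns a bare array one element shorter than ds.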