Reindexing a pandas Series to the nearest date

Asked: 2013-12-11 23:50:34

Tags: python pandas

I have a pandas time series that looks like this:

In [1]: ser1
Out[1]: 
Date
2005-12-31    11382000
Name: Amount, dtype: float64

I would like to reindex it using the index of another time series:

In [2]: ser2
Out[2]: 
Date
2005-12-20    14.13
2005-12-21    14.22
2005-12-22    14.30
2005-12-23    14.35
2005-12-27    14.32
2005-12-28    14.32
2005-12-29    14.23
2005-12-30    14.19
2006-01-03    14.48
2006-01-04    14.54
2006-01-05    14.68
Name: Amount, dtype: float64

But when I use

ser3 = ser1.reindex(ser2.index)

I get

In [4]: ser3
Out[4]: 
Date
2005-12-20   NaN
2005-12-21   NaN
2005-12-22   NaN
2005-12-23   NaN
2005-12-27   NaN
2005-12-28   NaN
2005-12-29   NaN
2005-12-30   NaN
2006-01-03   NaN
2006-01-04   NaN
2006-01-05   NaN
Name: Amount, dtype: float64

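Note that the item from ser1, dated '2005-12-31', does not appear in ser3, because the index of ser2 does not include 2005-12-31. I would like to place the value from ser1 on the next available date in the index of ser2. How can I do this?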
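The behaviour described above can be reproduced in a few lines. This is only a sketch: it rebuilds ser1 and a three-row subset of ser2 around the gap from the values printed in the question, and the helper names `label_present` and `all_missing` are introduced here purely for illustration.

```python
import pandas as pd

# Values copied from the question; ser2 is cut down to the rows around the gap.
ser1 = pd.Series([11382000.0], index=pd.to_datetime(["2005-12-31"]), name="Amount")
ser2 = pd.Series(
    [14.19, 14.48, 14.54],
    index=pd.to_datetime(["2005-12-30", "2006-01-03", "2006-01-04"]),
    name="Amount",
)

# Without a fill method, reindex only copies values whose label matches exactly.
label_present = ser1.index[0] in ser2.index
ser3 = ser1.reindex(ser2.index)
all_missing = bool(ser3.isna().all())

print(label_present)  # False: 2005-12-31 is not a label in ser2
print(all_missing)    # True: every row of ser3 is NaN
```

Since no label in ser2 equals 2005-12-31, a plain reindex has nothing to copy and every row comes back as NaN.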

1 Answer:

Answer 0: (score: 3)

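The following fills the value onto the nearest following date when that position would otherwise be NaN (if the date exists in the index, the value at that exact label is used). Use method='bfill' if you want the nearest preceding date instead. IIRC this is still an open issue in pandas, because it is somewhat non-trivial (in theory there should be a fill method for it, e.g. 'nearest'), but it needs a PR!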

In [24]: from pandas import Series, Timestamp

In [25]: ser1 = Series(100000,index=[Timestamp('20051231')])

In [26]: ser1
Out[26]: 
2005-12-31    100000
dtype: int64

In [27]: ser2
Out[27]: 
0
2005-12-20    14.13
2005-12-21    14.22
2005-12-22    14.30
2005-12-23    14.35
2005-12-27    14.32
2005-12-28    14.32
2005-12-29    14.23
2005-12-30    14.19
2006-01-03    14.48
2006-01-04    14.54
2006-01-05    14.68
Name: 1, dtype: float64

In [28]: ser1.reindex(ser2.index,method='ffill',limit=1)
Out[28]: 
0
2005-12-20       NaN
2005-12-21       NaN
2005-12-22       NaN
2005-12-23       NaN
2005-12-27       NaN
2005-12-28       NaN
2005-12-29       NaN
2005-12-30       NaN
2006-01-03    100000
2006-01-04       NaN
2006-01-05       NaN
dtype: float64
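For anyone who wants to run this outside an IPython session, here is a self-contained sketch of the same idea. The `target` index is a shortened, made-up subset of the dates in ser2, and the variable names are illustrative only. It also tries method='nearest' with a tolerance, which reindex accepts in recent pandas releases; treat that part as an assumption about your installed version.

```python
import pandas as pd

ser1 = pd.Series([100000], index=pd.to_datetime(["2005-12-31"]))
# Hypothetical subset of ser2's index around the missing date.
target = pd.to_datetime(["2005-12-29", "2005-12-30", "2006-01-03", "2006-01-04"])

# ffill carries the last earlier value forward; limit=1 stops after one row,
# so the value lands only on the first date after 2005-12-31.
forward = ser1.reindex(target, method="ffill", limit=1)

# bfill pulls the next later value backward, so the value lands only on
# the last date before 2005-12-31.
backward = ser1.reindex(target, method="bfill", limit=1)

# nearest uses the closest source label; tolerance caps the allowed distance.
nearest = ser1.reindex(target, method="nearest", tolerance=pd.Timedelta("1D"))

print(forward.dropna().index[0].date())   # 2006-01-03
print(backward.dropna().index[0].date())  # 2005-12-30
print(nearest.dropna().index[0].date())   # 2005-12-30
```

Both indexes need to be sorted for the fill methods to work. Without the tolerance, method='nearest' would copy the single value onto every row of `target`, which is usually not what is wanted here.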