I have a Pandas time series that looks like this:
In [1]: ser1
Out[1]:
Date
2005-12-31 11382000
Name: Amount, dtype: float64
I would like to reindex it using the index of another time series:
In [2]: ser2
Out[2]:
Date
2005-12-20 14.13
2005-12-21 14.22
2005-12-22 14.30
2005-12-23 14.35
2005-12-27 14.32
2005-12-28 14.32
2005-12-29 14.23
2005-12-30 14.19
2006-01-03 14.48
2006-01-04 14.54
2006-01-05 14.68
Name: Amount, dtype: float64
But when I use
ser3 = ser1.reindex(ser2.index)
I get
In [4]: ser3
Out[4]:
Date
2005-12-20 NaN
2005-12-21 NaN
2005-12-22 NaN
2005-12-23 NaN
2005-12-27 NaN
2005-12-28 NaN
2005-12-29 NaN
2005-12-30 NaN
2006-01-03 NaN
2006-01-04 NaN
2006-01-05 NaN
Name: Amount, dtype: float64
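Note that the entry from ser1 dated '2005-12-31' does not appear in ser3, because ser2's index does not include 2005-12-31. I would like to place ser1's value at the next available date in ser2's index. How can I do this?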
Answer 0 (score: 3)
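The following fills the value forward onto the nearest later date in the new index wherever the result would otherwise be NaN (if a date matches exactly, the value at that index is used). limit=1 keeps the value from being carried past the first such date. If you want the nearest earlier date instead, use method='bfill'. IIRC this is still an open issue in pandas because it is somewhat non-trivial (in theory there should be a fill method for it, e.g. 'nearest'), but it needs a PR!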
In [24]: from pandas import Series, Timestamp
In [25]: ser1 = Series(100000,index=[Timestamp('20051231')])
In [26]: ser1
Out[26]:
2005-12-31 100000
dtype: int64
In [27]: ser2
Out[27]:
0
2005-12-20 14.13
2005-12-21 14.22
2005-12-22 14.30
2005-12-23 14.35
2005-12-27 14.32
2005-12-28 14.32
2005-12-29 14.23
2005-12-30 14.19
2006-01-03 14.48
2006-01-04 14.54
2006-01-05 14.68
Name: 1, dtype: float64
In [28]: ser1.reindex(ser2.index,method='ffill',limit=1)
Out[28]:
0
2005-12-20 NaN
2005-12-21 NaN
2005-12-22 NaN
2005-12-23 NaN
2005-12-27 NaN
2005-12-28 NaN
2005-12-29 NaN
2005-12-30 NaN
2006-01-03 100000
2006-01-04 NaN
2006-01-05 NaN
dtype: float64
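For anyone who wants to reproduce this without the session state, here is a minimal self-contained sketch of the same approach. It assumes the dates and amount from the question; the variable name `target` is only illustrative and stands in for ser2.index.

```python
import pandas as pd

# Single observation dated on a day that is missing from the target index.
ser1 = pd.Series([11382000.0], index=pd.to_datetime(["2005-12-31"]), name="Amount")

# Target index: the dates of ser2 from the question (hypothetical name `target`).
target = pd.to_datetime([
    "2005-12-20", "2005-12-21", "2005-12-22", "2005-12-23",
    "2005-12-27", "2005-12-28", "2005-12-29", "2005-12-30",
    "2006-01-03", "2006-01-04", "2006-01-05",
])

# method="ffill" carries the last earlier observation forward onto the new index;
# limit=1 stops the fill after the first target date that follows it.
ser3 = ser1.reindex(target, method="ffill", limit=1)

# Show only the dates that received a value.
print(ser3.dropna())
```

Only 2006-01-03 should end up with a value, matching the output above. Note that this relies on both indexes being sorted in increasing order, which reindex requires when a fill method is given.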