我有一个包含两列的数据框,如下所示:
date value
0 2017-05-01 1
1 2017-05-08 4
2 2017-05-15 9
每一行显示一周的星期一,我只有该特定日期的值。我想估计直到下周一的整个工作日的这个值,并获得以下输出:
date value
0 2017-05-01 1
1 2017-05-02 1
2 2017-05-03 1
3 2017-05-04 1
4 2017-05-05 1
5 2017-05-06 1
6 2017-05-07 1
7 2017-05-08 4
8 2017-05-09 4
9 2017-05-10 4
10 2017-05-11 4
11 2017-05-12 4
12 2017-05-13 4
13 2017-05-14 4
14 2017-05-15 9
15 2017-05-16 9
16 2017-05-17 9
17 2017-05-18 9
18 2017-05-19 9
19 2017-05-20 9
20 2017-05-21 9
在this link中,它显示了如何在Dataframe中选择范围,但我不知道如何解释value
列。
答案 0 :(得分:2)
以下是使用pandas reindex和ffill的解决方案:
# Make sure dates is treated as datetime
df['date'] = pd.to_datetime(df['date'], format = "%Y-%m-%d")
from pandas.tseries.offsets import DateOffset
# Create target dates: all days in the weeks in the original dataframe
new_index = pd.date_range(start=df['date'].iloc[0],
end=df['date'].iloc[-1] + DateOffset(6),
freq='D')
# Temporarily set dates as index, conform to target dates and forward fill data
# Finally reset the index as in the original df
out = df.set_index('date')\
.reindex(new_index).ffill()\
.reset_index(drop=False)\
.rename(columns = {'index' : 'date'})
这给出了预期的结果:
date value
0 2017-05-01 1.0
1 2017-05-02 1.0
2 2017-05-03 1.0
3 2017-05-04 1.0
4 2017-05-05 1.0
5 2017-05-06 1.0
6 2017-05-07 1.0
7 2017-05-08 4.0
8 2017-05-09 4.0
9 2017-05-10 4.0
10 2017-05-11 4.0
11 2017-05-12 4.0
12 2017-05-13 4.0
13 2017-05-14 4.0
14 2017-05-15 9.0
15 2017-05-16 9.0
16 2017-05-17 9.0
17 2017-05-18 9.0
18 2017-05-19 9.0
19 2017-05-20 9.0
20 2017-05-21 9.0