Question

I have a dataframe with two columns, as shown below:

         date        value
0     2017-05-01       1
1     2017-05-08       4
2     2017-05-15       9

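Each row is the Monday of a week, and I only have the value for that specific day. I want to carry this value over to every day of the week, up to the next Monday, and get the following output: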

            date        value
0      2017-05-01       1
1      2017-05-02       1
2      2017-05-03       1
3      2017-05-04       1
4      2017-05-05       1
5      2017-05-06       1
6      2017-05-07       1
7      2017-05-08       4
8      2017-05-09       4
9      2017-05-10       4
10     2017-05-11       4
11     2017-05-12       4
12     2017-05-13       4
13     2017-05-14       4
14     2017-05-15       9
15     2017-05-16       9
16     2017-05-17       9
17     2017-05-18       9
18     2017-05-19       9
19     2017-05-20       9
20     2017-05-21       9

This link shows how to select a date range in a DataFrame, but I don't know how to fill in the value column.

Answer 1

Here is a solution using pandas reindex and ffill:

import pandas as pd
from pandas.tseries.offsets import DateOffset

# Make sure the dates are treated as datetime
df['date'] = pd.to_datetime(df['date'], format="%Y-%m-%d")

# Create target dates: every day of each week in the original dataframe
# (from the first Monday through six days after the last Monday)
new_index = pd.date_range(start=df['date'].iloc[0],
                          end=df['date'].iloc[-1] + DateOffset(days=6),
                          freq='D')

# Temporarily set the dates as index, conform to the target dates and
# forward fill the data; finally restore 'date' as a regular column
out = (df.set_index('date')
         .reindex(new_index).ffill()
         .reset_index(drop=False)
         .rename(columns={'index': 'date'}))

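This gives the expected result (the value column becomes float because reindex introduces NaN before the forward fill):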

         date  value
0  2017-05-01    1.0
1  2017-05-02    1.0
2  2017-05-03    1.0
3  2017-05-04    1.0
4  2017-05-05    1.0
5  2017-05-06    1.0
6  2017-05-07    1.0
7  2017-05-08    4.0
8  2017-05-09    4.0
9  2017-05-10    4.0
10 2017-05-11    4.0
11 2017-05-12    4.0
12 2017-05-13    4.0
13 2017-05-14    4.0
14 2017-05-15    9.0
15 2017-05-16    9.0
16 2017-05-17    9.0
17 2017-05-18    9.0
18 2017-05-19    9.0
19 2017-05-20    9.0
20 2017-05-21    9.0
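
If the integer values from the question are wanted rather than floats, the same reindex and ffill approach can be wrapped in a small helper that casts the column back afterwards. This is only a sketch: the function name expand_weekly_to_daily and its parameters are made up for illustration, and it assumes one row per week with no missing values in the input.

```python
import pandas as pd


def expand_weekly_to_daily(df, date_col="date", value_col="value"):
    """Repeat each weekly value for the 7 days starting at its date.

    Hypothetical helper; assumes one row per week and no NaN in value_col.
    """
    weekly = df.copy()
    weekly[date_col] = pd.to_datetime(weekly[date_col])
    weekly = weekly.sort_values(date_col).set_index(date_col)

    # One entry per calendar day, from the first date through
    # six days after the last date.
    days = pd.date_range(
        start=weekly.index[0],
        end=weekly.index[-1] + pd.Timedelta(days=6),
        freq="D",
    )

    daily = weekly.reindex(days).ffill()
    # reindex inserts NaN, which turns an integer column into float;
    # after the forward fill no NaN remains, so the original dtype
    # can be restored.
    daily[value_col] = daily[value_col].astype(weekly[value_col].dtype)
    return daily.rename_axis(date_col).reset_index()


df = pd.DataFrame({
    "date": ["2017-05-01", "2017-05-08", "2017-05-15"],
    "value": [1, 4, 9],
})
out = expand_weekly_to_daily(df)
print(out)
```

The cast is only safe because every new day falls on or after the first date in the frame, so the forward fill leaves no gaps.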

Projecting the value from the first day of the week onto the whole week in pandas
