Question

Here is the data:

import numpy as np
import pandas as pd

days = ['2019-07-03', '2019-07-03',
        '2019-07-04', '2019-07-04',
        '2019-07-07', '2019-07-08',
        '2019-07-08', '2019-07-08']

days_completed = ['2019-07-05', '2019-07-04',
                  '2019-07-05', '2019-07-06',
                  '2019-07-07', '2019-07-08',
                  '2019-07-08', '2019-07-08']

ddf = pd.DataFrame({'Val': [0, 1, 2, 1, 4, 1, 3, 1],
                    'days_completed': days_completed},
                   index=days)

ddf.index = pd.to_datetime(ddf.index)
ddf["days_completed"] = pd.to_datetime(ddf["days_completed"])

           Val    days_completed
2019-07-03  0      2019-07-05
2019-07-03  1      2019-07-04
2019-07-04  2      2019-07-05
2019-07-04  1      2019-07-06
2019-07-07  4      2019-07-07
2019-07-08  1      2019-07-08
2019-07-08  3      2019-07-08
2019-07-08  1      2019-07-08

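For each date in the index, I want to apply a rolling sum with a 1-day offset, but exclude values that had not yet been completed as of the index date (that is, rows whose days_completed is later than the index date). Like this: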

           Sum(Val)    days_completed
2019-07-03  NaN        2019-07-05
2019-07-03  NaN        2019-07-04
2019-07-04  1          2019-07-05
2019-07-04  1          2019-07-06
2019-07-07  4          2019-07-07
2019-07-08  5          2019-07-08
2019-07-08  8          2019-07-08
2019-07-09  9          2019-07-10
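One way to read the expected output above is sketched below with an explicit loop rather than `rolling`. The function name `conditional_rolling_sum` is hypothetical, and the sketch rests on assumptions inferred from the table, not stated in the question: the window covers the current date and the previous calendar day (both ends inclusive), rows sharing a date accumulate in row order, and the result is NaN when no row qualifies. It uses the data exactly as constructed above, so the last row is indexed 2019-07-08.

```python
import numpy as np
import pandas as pd

days = ['2019-07-03', '2019-07-03', '2019-07-04', '2019-07-04',
        '2019-07-07', '2019-07-08', '2019-07-08', '2019-07-08']
days_completed = ['2019-07-05', '2019-07-04', '2019-07-05', '2019-07-06',
                  '2019-07-07', '2019-07-08', '2019-07-08', '2019-07-08']

ddf = pd.DataFrame({'Val': [0, 1, 2, 1, 4, 1, 3, 1],
                    'days_completed': pd.to_datetime(days_completed)},
                   index=pd.to_datetime(days))


def conditional_rolling_sum(df, window=np.timedelta64(1, 'D')):
    """Hypothetical helper: sum Val over rows that are (a) at or before the
    current row position, (b) dated within `window` of the current index
    date, and (c) already completed on the current index date."""
    idx = df.index.to_numpy()
    done = df['days_completed'].to_numpy()
    vals = df['Val'].to_numpy(dtype=float)
    pos = np.arange(len(df))
    out = np.full(len(df), np.nan)  # NaN where nothing qualifies (assumption)
    for i in range(len(df)):
        t = idx[i]
        mask = (pos <= i) & (idx >= t - window) & (done <= t)
        if mask.any():
            out[i] = vals[mask].sum()
    return pd.Series(out, index=df.index, name='Sum(Val)')


result = conditional_rolling_sum(ddf)
print(result)
```

The loop is quadratic in the number of rows, which is fine for small frames; for large data a vectorized or merge-based approach would likely be preferable.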

So far, this is essentially what I have tried:

ddf.rolling("1D")["Val","days_completed"].apply(lambda x: np.sum(x[x["days_completed"] <= x.index]["Val"]))

But it returns an error:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
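A likely reason for this error, as far as I understand pandas: `Rolling.apply` calls the function once per window for each column separately, passing one-dimensional data (a NumPy array with `raw=True`, which was the effective default in pandas versions from around 2019), not a sub-DataFrame. Indexing that array with a column name such as `"days_completed"` then fails. The small probe below is only an illustration; `demo`, `probe`, and `seen` are made-up names.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime(['2019-07-03', '2019-07-03', '2019-07-04'])
demo = pd.DataFrame({'Val': [0.0, 1.0, 2.0]}, index=idx)

seen = []


def probe(x):
    # Record what rolling.apply actually hands to the function.
    seen.append((type(x).__name__, x.ndim))
    return x.sum()


demo.rolling('1D').apply(probe, raw=True)
print(seen)  # every entry describes a 1-D ndarray, never a DataFrame

# Indexing a plain array by a column name raises the IndexError from the question.
msg = None
try:
    np.array([0.0, 1.0])['days_completed']
except IndexError as exc:
    msg = str(exc)
print(msg)
```

Because the function only ever sees one column at a time, a condition that compares two columns (here `days_completed` against the index) cannot be expressed inside a single `rolling(...).apply(...)` call in this form.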

Thanks for your help!

Conditional rolling on datetime objects

0 answers: