Question

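This is a follow-up to a question about counting activity between two datetimes, which was answered very well here: Create a Pandas dataframe with counts of items spanning a date range

The remaining issue is the last month. After the two tables are summed and subtracted, ['END_DATE'] ends up showing zero for that month. That is mathematically correct, because every item has an end date in the current month or earlier. But since the items were active for at least part of that month, it would be more accurate to add one month to END_DATE so that they show as active in their ending month (H2 is the dataframe).

The code is: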

ends = H2['END_DATE'].apply(lambda t: t.to_period(freq='m')).value_counts()

I tried using rollforward and DateOffset(months=1). For example, with DateOffset:

ends = (H2['END_DATE'].DateOffset(months=1)).apply(lambda t: t.to_period(freq='m')).value_counts()

This gives me the following error:

AttributeError: 'Series' object has no attribute 'DateOffset'

Answer 1

The simplest way is to add one (month) to the PeriodIndex:

In [21]: ends
Out[21]:
2000-05    1
2000-09    1
2001-06    1
Freq: M, dtype: int64

In [22]: ends.index = ends.index + 1

In [23]: ends
Out[23]:
2000-06    1
2000-10    1
2001-07    1
Freq: M, dtype: int64
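
The same one-month shift can also be applied before counting. The error in the question arises because DateOffset is a top-level pandas class (pd.DateOffset), not a method of Series, so it has to be added to the column rather than called on it. Below is a minimal sketch, assuming H2 has a datetime64 END_DATE column. The sample dates are made up to reproduce the counts shown above.

```python
import pandas as pd

# Hypothetical sample data standing in for H2 (not from the original post)
H2 = pd.DataFrame(
    {"END_DATE": pd.to_datetime(["2000-05-17", "2000-09-03", "2001-06-30"])}
)

# Option 1: convert to monthly periods, then add one period (one month)
ends = (H2["END_DATE"].dt.to_period("M") + 1).value_counts().sort_index()

# Option 2: add a DateOffset to the timestamps first, then convert to periods
ends_alt = (
    (H2["END_DATE"] + pd.DateOffset(months=1))
    .dt.to_period("M")
    .value_counts()
    .sort_index()
)

print(ends)
```

Both options move each end date into the following month. The first works on whole periods, so it avoids any question about day-of-month rollover.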

My original suggestion was to shift after reindexing (since you'll need to reindex anyway):

In [11]: ends
Out[11]:
2000-05    1
2000-09    1
2001-06    1
Freq: M, dtype: int64

In [12]: p = pd.period_range(start='2000-01', periods=19, freq='M')  # Note: needs to be one more than before

In [13]: sparse_ends = ends.reindex(p)

In [14]: sparse_ends.shift(1)
Out[14]:
2000-01   NaN
2000-02   NaN
2000-03   NaN
2000-04   NaN
2000-05   NaN
2000-06     1
2000-07   NaN
2000-08   NaN
2000-09   NaN
2000-10     1
2000-11   NaN
2000-12   NaN
2001-01   NaN
2001-02   NaN
2001-03   NaN
2001-04   NaN
2001-05   NaN
2001-06   NaN
2001-07     1
Freq: M, dtype: float64
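
The reindex-then-shift approach can be written as a standalone script. This is a sketch under a few assumptions: the ends series is rebuilt by hand from the output above, and the final fillna(0) step is an addition (not part of the original answer) so that the result can be subtracted from a matching series of start counts without propagating NaN.

```python
import pandas as pd

# Rebuild the example `ends` series shown above
ends = pd.Series(
    [1, 1, 1],
    index=pd.PeriodIndex(["2000-05", "2000-09", "2001-06"], freq="M"),
)

# 19 months: one more than the unshifted range, so the last end month
# still has a slot to move into
p = pd.period_range(start="2000-01", periods=19, freq="M")

# Align to the full monthly range, then push every count one month later
sparse_ends = ends.reindex(p)
shifted = sparse_ends.shift(1)

# Assumption: treat months with no ending items as zero
shifted_counts = shifted.fillna(0).astype(int)

print(shifted_counts)
```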
