Pandas: find gaps (of n days) and capture start/end dates

Time: 2020-06-11 10:40:55

Tags: python-3.x pandas numpy

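This started from a list of activities. I first built a matrix like the one below to represent all activity, then inverted it to show all inactivity, and finally built the following matrix, in which zero means the item is active on that day and a value greater than zero is the number of days until the next activity.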

+------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| Item | 01/08/2020 | 02/08/2020 | 03/08/2020 | 04/08/2020 | 05/08/2020 | 06/08/2020 | 07/08/2020 | 08/08/2020 | 09/08/2020 |
+------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| A    |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |          0 |
| B    |          3 |          2 |          1 |          0 |          0 |          3 |          2 |          1 |          0 |
| C    |          0 |          2 |          1 |          0 |          1 |          0 |          0 |          0 |          0 |
| D    |          7 |          6 |          5 |          4 |          3 |          2 |          1 |          0 |          0 |
| E    |         11 |         10 |          9 |          8 |          7 |          6 |          5 |          4 |          3 |
+------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
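
For illustration, the construction described above can be sketched as follows. This is only one possible approach, not necessarily how the matrix in the question was produced. The function name `days_to_next_activity` and the boolean frame `active` (True = active on that day) are hypothetical, and the column labels assume the dates are day-first. Cells with no later activity inside the window are left as NaN here; the table above has values for row E even though no activity falls within the shown dates, so the original presumably had data beyond these columns.

```python
import numpy as np
import pandas as pd

def days_to_next_activity(active):
    """active: boolean frame, rows = items, columns = consecutive days."""
    a = active.to_numpy(dtype=bool)
    n_cols = a.shape[1]
    idx = np.arange(n_cols)
    # Column position of each active cell; inactive cells get a sentinel past the end.
    pos = np.where(a, idx, n_cols)
    # Scanning right to left, keep the nearest active column at or after each cell.
    next_active = np.minimum.accumulate(pos[:, ::-1], axis=1)[:, ::-1]
    days = (next_active - idx).astype(float)
    # No later activity inside the window: the distance is unknown.
    days[next_active == n_cols] = np.nan
    return pd.DataFrame(days, index=active.index, columns=active.columns)

# Hypothetical activity rows chosen to reproduce items B and C above.
cols = pd.date_range("2020-08-01", periods=9, freq="D").strftime("%d/%m/%Y")
active = pd.DataFrame(
    [
        [0, 0, 0, 1, 1, 0, 0, 0, 1],  # item B
        [1, 0, 0, 1, 0, 1, 1, 1, 1],  # item C
    ],
    index=["B", "C"],
    columns=cols,
).astype(bool)
countdown = days_to_next_activity(active)
```

The whole matrix is processed with array operations rather than a Python loop per row, which should matter at the sizes mentioned in the question.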

Now I need to find the qualifying gaps for each item. For example, in this case I want to find all gaps that last at least 3 days.

+------+------------+------------+------------+------------+
| Item |  1_START   |   1_END    |  2_START   |   2_END    |
+------+------------+------------+------------+------------+
| A    | NaN        | NaN        | NaN        | NaN        |
| B    | 01/08/2020 | 03/08/2020 | 06/08/2020 | 08/08/2020 |
| C    | NaN        | NaN        | NaN        | NaN        |
| D    | 01/08/2020 | 07/08/2020 | NaN        | NaN        |
| E    | 01/08/2020 | NaN        | NaN        | NaN        |
+------+------------+------------+------------+------------+
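
A minimal sketch of one way to get this result, under stated assumptions. It relies on the countdown being consistent, so that the value on the first day of a gap equals the gap's length. It also assumes that a gap begins wherever a nonzero cell follows a zero or sits in the first column, and that a gap whose countdown runs past the last column gets no end date (as in row E). The name `find_gaps` and the parameter `min_days` are hypothetical, and the column labels assume day-first dates.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2020-08-01", periods=9, freq="D").strftime("%d/%m/%Y")
df = pd.DataFrame(
    [
        [0, 0, 0, 0, 0, 0, 0, 0, 0],
        [3, 2, 1, 0, 0, 3, 2, 1, 0],
        [0, 2, 1, 0, 1, 0, 0, 0, 0],
        [7, 6, 5, 4, 3, 2, 1, 0, 0],
        [11, 10, 9, 8, 7, 6, 5, 4, 3],
    ],
    index=pd.Index(list("ABCDE"), name="Item"),
    columns=dates,
)

def find_gaps(df, min_days=3):
    v = df.to_numpy()
    n_cols = v.shape[1]
    # Value in the previous column; the first column is treated as following activity.
    prev = np.zeros_like(v)
    prev[:, 1:] = v[:, :-1]
    # A gap starts where a nonzero cell follows a zero; keep only long enough gaps.
    is_start = (v > 0) & (prev == 0) & (v >= min_days)
    rows, starts = np.nonzero(is_start)  # row-major, so starts ascend within a row
    # The countdown value at the start is the gap length (assumption, see above).
    ends = starts + v[rows, starts] - 1
    labels = df.columns.to_numpy(dtype=object)
    closed = ends < n_cols  # False when the gap runs past the last column
    end_labels = np.full(len(ends), np.nan, dtype=object)
    end_labels[closed] = labels[ends[closed]]
    long = pd.DataFrame({
        "Item": df.index.to_numpy()[rows],
        "START": labels[starts],
        "END": end_labels,
    })
    # Number the gaps per item, then spread them into n_START / n_END columns.
    long["n"] = long.groupby("Item").cumcount() + 1
    parts = {}
    for n, grp in long.groupby("n"):
        g = grp.set_index("Item")
        parts[f"{n}_START"] = g["START"]
        parts[f"{n}_END"] = g["END"]
    # Reindex so that items without any qualifying gap appear as all-NaN rows.
    return pd.DataFrame(parts).reindex(df.index)

gaps = find_gaps(df, min_days=3)
```

The detection step works on the whole array at once; the only Python loop runs over the gap number (1, 2, ...), not over rows or columns.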

In practice, the data is 700+ columns wide and has 1000+ rows. How can I do this efficiently?

0 answers:

No answers