In Python

Time: 2018-03-27 10:07:00

Tags: python pandas

I have this df:

dateTime             1min                 hour  minute   X  EXPECTED Rolling_X
2017-09-19 02:00:04  2017-09-19 02:00:00     2       0   5                   5
2017-09-19 02:00:04  2017-09-19 02:00:00     2       0   1                   6
2017-09-19 02:00:04  2017-09-19 02:00:00     2       0   1                   7
2017-09-19 02:00:22  2017-09-19 02:00:00     2       0   2                   9
2017-09-19 02:01:31  2017-09-19 02:01:00     2       1   0                   9
2017-09-19 02:01:31  2017-09-19 02:01:00     2       1   1                  10
2017-09-19 02:01:32  2017-09-19 02:01:00     2       1   1                  11
2017-09-19 02:01:34  2017-09-19 02:01:00     2       1   6                  17
2017-09-19 02:01:35  2017-09-19 02:01:00     2       1   5                  22
2017-09-19 02:01:35  2017-09-19 02:01:00     2       1   0                  22
2017-09-19 02:01:39  2017-09-19 02:01:00     2       1   1                  23
2017-09-19 02:01:58  2017-09-19 02:01:00     2       1   2                  25
2017-09-19 02:01:58  2017-09-19 02:01:00     2       1   0                  25
2017-09-19 02:02:02  2017-09-19 02:02:00     2       2   3                  19
2017-09-19 02:02:32  2017-09-19 02:02:00     2       2   0                  19
2017-09-19 02:02:32  2017-09-19 02:02:00     2       2   1                  20
2017-09-19 02:02:40  2017-09-19 02:02:00     2       2  15                  35
2017-09-19 02:02:41  2017-09-19 02:02:00     2       2   6                  41
2017-09-19 02:02:44  2017-09-19 02:02:00     2       2   1                  42
2017-09-19 02:02:53  2017-09-19 02:02:00     2       2   3                  45
2017-09-19 02:03:00  2017-09-19 02:03:00     2       3   1                  30
2017-09-19 02:03:00  2017-09-19 02:03:00     2       3   1                  31
2017-09-19 02:03:05  2017-09-19 02:03:00     2       3   1                  32
2017-09-19 02:04:07  2017-09-19 02:04:00     2       4   7                  10
2017-09-19 02:04:58  2017-09-19 02:04:00     2       4   2                  12
2017-09-19 02:05:22  2017-09-19 02:05:00     2       5  11                  23
2017-09-19 02:05:36  2017-09-19 02:05:00     2       5   2                  25

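I am looking for a continuous rolling sum of X that depends on the minute.

The sum should cover only the last 2 minutes. X keeps accumulating row by row, but each time dateTime moves into the next minute, the oldest minute drops off the tail, so the total never spans more than 2 minutes. The difficulty is that the number of rows per minute is not constant.

This is what I tried: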

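# True on the last row of each fixed 2-minute bucket (the next row starts the following bucket)
s2m = df['dateTime'].dt.floor('2T').diff().shift(-1).eq(pd.Timedelta('2 minutes'))
# Running total of X over the whole frame
s2m1 = df['X'].cumsum()
# Subtract the running total reached at the end of the previous bucket, so the sum restarts every 2 minutes
df['truncate_2m'] = s2m.mul(s2m1).diff().where(lambda x: x < 0).ffill().add(s2m1, fill_value=0)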

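But it resets every 2 minutes instead of rolling continuously.

Thanks for your help!

1 Answer:

Answer 0: (score: 1)

OK, what you need here is a rolling window.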

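# A time-based window needs a sorted DatetimeIndex
df = df.set_index('dateTime')
# Sum X over the 2 minutes ending at each row ('2T' means 2 minutes; newer pandas prefers '2min')
df['Roll_X'] = df.rolling('2T')['X'].sum()
df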

Output:

                                    1min  hour  minute   X  EXPECTED Rolling_X  Roll_X
dateTime                                                                              
2017-09-19 02:00:04  2017-09-19 02:00:00     2       0   5                   5     5.0
2017-09-19 02:00:04  2017-09-19 02:00:00     2       0   1                   6     6.0
2017-09-19 02:00:04  2017-09-19 02:00:00     2       0   1                   7     7.0
2017-09-19 02:00:22  2017-09-19 02:00:00     2       0   2                   9     9.0
2017-09-19 02:01:31  2017-09-19 02:01:00     2       1   0                   9     9.0
2017-09-19 02:01:31  2017-09-19 02:01:00     2       1   1                  10    10.0
2017-09-19 02:01:32  2017-09-19 02:01:00     2       1   1                  11    11.0
2017-09-19 02:01:34  2017-09-19 02:01:00     2       1   6                  17    17.0
2017-09-19 02:01:35  2017-09-19 02:01:00     2       1   5                  22    22.0
2017-09-19 02:01:35  2017-09-19 02:01:00     2       1   0                  22    22.0
2017-09-19 02:01:39  2017-09-19 02:01:00     2       1   1                  23    23.0
2017-09-19 02:01:58  2017-09-19 02:01:00     2       1   2                  25    25.0
2017-09-19 02:01:58  2017-09-19 02:01:00     2       1   0                  25    25.0
2017-09-19 02:02:02  2017-09-19 02:02:00     2       2   3                  19    28.0
2017-09-19 02:02:32  2017-09-19 02:02:00     2       2   0                  19    19.0
2017-09-19 02:02:32  2017-09-19 02:02:00     2       2   1                  20    20.0
2017-09-19 02:02:40  2017-09-19 02:02:00     2       2  15                  35    35.0
2017-09-19 02:02:41  2017-09-19 02:02:00     2       2   6                  41    41.0
2017-09-19 02:02:44  2017-09-19 02:02:00     2       2   1                  42    42.0
2017-09-19 02:02:53  2017-09-19 02:02:00     2       2   3                  45    45.0
2017-09-19 02:03:00  2017-09-19 02:03:00     2       3   1                  30    46.0
2017-09-19 02:03:00  2017-09-19 02:03:00     2       3   1                  31    47.0
2017-09-19 02:03:05  2017-09-19 02:03:00     2       3   1                  32    48.0
2017-09-19 02:04:07  2017-09-19 02:04:00     2       4   7                  10    36.0
2017-09-19 02:04:58  2017-09-19 02:04:00     2       4   2                  12    12.0
2017-09-19 02:05:22  2017-09-19 02:05:00     2       5  11                  23    20.0
2017-09-19 02:05:36  2017-09-19 02:05:00     2       5   2                  25    22.0
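
To see how the time-based window picks its rows, here is a minimal sketch on four rows taken from the sample data (so the sums are smaller than in the full table). The name `demo` is made up for illustration. With the default settings, each row sums everything whose timestamp lies in the 2 minutes up to and including that row's own timestamp, regardless of where the calendar minute starts.

```python
import pandas as pd

# Four rows copied from the sample data, around the 02:02 boundary
demo = pd.DataFrame(
    {"X": [2, 2, 3, 0]},
    index=pd.to_datetime([
        "2017-09-19 02:00:22",
        "2017-09-19 02:01:58",
        "2017-09-19 02:02:02",
        "2017-09-19 02:02:32",
    ]),
)

# Trailing 2-minute window ending at each row's own timestamp
demo["Roll_X"] = demo["X"].rolling("2min").sum()
print(demo)
```

The 02:00:22 row is still counted at 02:02:02, because it is less than 2 minutes old at that point, and only drops out at 02:02:32.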

Check around 2:03, where the values differ. How did you calculate 30 when I get 46?
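
One possible reading of the EXPECTED Rolling_X column, offered as a guess rather than as the asker's confirmed rule: the window is made of whole calendar minutes, namely the complete previous minute plus the current minute so far. Under that reading, 30 at 02:03:00 is 29 (all of minute 02:02) plus 1, whereas `rolling('2T')` reaches back to 02:01:00 and also counts the 16 from minute 02:01, giving 46. The sketch below rebuilds the sample data and computes that calendar-minute version. The column name `calendar_roll` and the helper variables are made up for illustration.

```python
import pandas as pd

# dateTime and X copied from the question (all rows are on 2017-09-19)
times = [
    "02:00:04", "02:00:04", "02:00:04", "02:00:22",
    "02:01:31", "02:01:31", "02:01:32", "02:01:34", "02:01:35",
    "02:01:35", "02:01:39", "02:01:58", "02:01:58",
    "02:02:02", "02:02:32", "02:02:32", "02:02:40", "02:02:41",
    "02:02:44", "02:02:53",
    "02:03:00", "02:03:00", "02:03:05",
    "02:04:07", "02:04:58",
    "02:05:22", "02:05:36",
]
x = [5, 1, 1, 2, 0, 1, 1, 6, 5, 0, 1, 2, 0,
     3, 0, 1, 15, 6, 1, 3, 1, 1, 1, 7, 2, 11, 2]
df = pd.DataFrame({
    "dateTime": pd.to_datetime(["2017-09-19 " + t for t in times]),
    "X": x,
})

# Calendar minute each row belongs to (same as the 1min column)
minute = df["dateTime"].dt.floor("min")

# Total of X per calendar minute
per_minute = df.groupby(minute)["X"].sum()

# Total of the minute just before each row's minute (0 if that minute has no rows)
prev_total = (minute - pd.Timedelta(minutes=1)).map(per_minute).fillna(0)

# Running total inside the current minute
within = df.groupby(minute)["X"].cumsum()

df["calendar_roll"] = (within + prev_total).astype(int)
print(df)
```

This reproduces EXPECTED Rolling_X for every row up to 02:04:58. For the last two rows it gives 20 and 22, the same as Roll_X, while the question lists 23 and 25. Whether that is a slip in the question or a different rule is not clear from the thread.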