Counting consecutive occurrences of a condition in pandas

Time: 2017-12-14 23:12:13

Tags: python pandas

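I am trying to count how many times a condition occurs on consecutive days. I read about groupby.cumcount(), but it does not work the way I want. Here is a small sample of the data: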

                  min        max
time                            
1970-01-02  -3.440000  -1.180000
1970-01-03  -4.830000  -0.700000
1970-01-04  -6.250000   0.250000
1970-01-05 -11.700000  -6.690000
1970-01-06 -13.000000  -3.720000
1970-01-07  -3.870000   2.070000
1970-01-08   0.320000   2.690000
1970-01-09  -5.170000   2.310000
1970-01-10  -4.540000   1.140000
1970-01-11  -7.260000   1.300000
1970-01-12  -9.870000  -0.780000
1970-01-13  -6.520000  -0.390000
1970-01-14  -8.490000  -5.090000
1970-01-15 -13.670000  -8.670000
1970-01-16 -11.080000  -4.110000
1970-01-17 -24.770000  -7.320000
1970-01-18 -29.709999 -24.230000
1970-01-19 -24.200001 -19.480000
1970-01-20 -31.000000 -13.810000
1970-01-21 -36.389999 -30.209999
1970-01-22 -39.889999 -36.990002
1970-01-23 -41.750000 -38.730000
1970-01-24 -38.259998  -8.510000
1970-01-25 -14.100000  -5.740000
1970-01-26 -12.000000  -8.540000
1970-01-27 -12.060000  -7.470000
1970-01-28 -10.230000  -7.710000
1970-01-29 -10.850000  -8.400000
1970-01-30 -15.270000  -9.870000
1970-01-31 -11.920000  -5.290000

Consider the condition df['min'] <= -30 and a window (period) of 3 days.

I would like to know how many times we had 3 consecutive days with 'min' at or below -30.

So the result would look like this (dummy values):

       occurences
time   
1970   3
1971   4
1972   2
1973   3

I have played around with a few solutions but could not get close. Any suggestions?

1 answer:

Answer 0: (score: 3)

IIUC:

In [94]: x = df[df.rolling(3)['min'].max() <= -30]

In [95]: x.groupby(x.index.year)['min'].count().to_frame('occurences')
Out[95]:
      occurences
1970           3
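
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The small `df` below is a hypothetical slice of the question's data (values rounded), not the full dataset. The rolling maximum over 3 rows is at or below -30 only when all three days are, so each `True` marks one qualifying 3-day window. Note that this counts overlapping windows: a 5-day cold spell yields 3 windows. If what is wanted is the number of distinct runs of at least 3 days instead, the second part shows one possible way to count those; that alternative reading is an assumption, not something the question confirms.

```python
import pandas as pd

# Hypothetical sample: a slice of the question's 'min' column (values rounded).
idx = pd.date_range("1970-01-19", "1970-01-25", freq="D", name="time")
df = pd.DataFrame(
    {"min": [-24.20, -31.00, -36.39, -39.89, -41.75, -38.26, -14.10]},
    index=idx,
)

# A 3-day window qualifies when its largest 'min' is <= -30,
# i.e. every day in the window is at or below -30.
# The first two rows have incomplete windows (NaN) and compare as False.
window_ok = df["min"].rolling(3).max() <= -30

# Count qualifying (overlapping) windows per year, as in the answer above.
windows_per_year = window_ok.groupby(window_ok.index.year).sum()
print(windows_per_year.to_frame("occurences"))

# Alternative reading (assumption): count distinct runs of >= 3 consecutive days.
below = df["min"] <= -30
flipped = below != below.shift(fill_value=False)  # True where the flag changes
run_id = flipped.cumsum()                         # one id per run of equal flags
run_len = run_id.map(run_id.value_counts())       # length of the run each row is in
run_start = below & flipped & (run_len >= 3)      # first day of each long cold run
runs_per_year = run_start.groupby(run_start.index.year).sum()
print(runs_per_year.to_frame("occurences"))
```

On this sample the first count is 3 windows for 1970 and the second is 1 run, since the five days from 1970-01-20 to 1970-01-24 form a single cold spell. A run is attributed to the year in which it starts.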