Resampling a pandas Series

Time: 2020-05-21 20:14:24

Tags: pandas

I have a simple time series:

2014-11-17 05:00:00+00:00  1.30367
2014-11-17 05:01:00+00:00  1.30352
2014-11-17 05:02:00+00:00  1.30382
2014-11-17 05:03:00+00:00  1.30373
2014-11-17 05:04:00+00:00  1.30425
2014-11-17 05:05:00+00:00  1.30399
2014-11-17 05:06:00+00:00  1.30378

I want to resample it with "2min". Ideally, I would like to get:

2014-11-17 05:01:00+00:00  1.30352
2014-11-17 05:03:00+00:00  1.30373
2014-11-17 05:05:00+00:00  1.30399
2014-11-17 05:07:00+00:00  1.30378

The built-in resample function gives me:

2014-11-17 05:00:00+00:00    1.30367
2014-11-17 05:02:00+00:00    1.30382
2014-11-17 05:04:00+00:00    1.30425
2014-11-17 05:06:00+00:00    1.30378

I am using series.resample(rule="2min", label="right", closed="right").last(). I am particularly confused by the first point.

Many thanks.

1 answer:

Answer 0 (score: 2)

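You don't want label='right'; use loffset instead. With the default left-closed, left-labelled bins, last() already picks the 05:01, 05:03, ... values, and loffset only shifts each label forward by half a window: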

from io import StringIO
import pandas


data = StringIO("""\
2014-11-17 05:00:00+00:00,1.30367
2014-11-17 05:01:00+00:00,1.30352
2014-11-17 05:02:00+00:00,1.30382
2014-11-17 05:03:00+00:00,1.30373
2014-11-17 05:04:00+00:00,1.30425
2014-11-17 05:05:00+00:00,1.30399
2014-11-17 05:06:00+00:00,1.30378
""")

window = pandas.offsets.Minute(2)

df = (
    pandas.read_csv(data, parse_dates=[0], header=None, names=['dt', 'value'])
        .set_index(['dt'])
        .resample(window, loffset=window/2)
        .last()
)

That gives me:

                             value
dt                                
2014-11-17 05:01:00+00:00  1.30352
2014-11-17 05:03:00+00:00  1.30373
2014-11-17 05:05:00+00:00  1.30399
2014-11-17 05:07:00+00:00  1.30378
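
On newer pandas versions, resample no longer accepts loffset (it was deprecated in 1.1 and removed in 2.0). The same result can most likely be obtained by resampling with the default bins and then shifting the index by hand. The following is a minimal sketch under that assumption and is not part of the original answer; the names series, resampled and half_window are illustrative only.

```python
from io import StringIO

import pandas as pd

data = StringIO("""\
2014-11-17 05:00:00+00:00,1.30367
2014-11-17 05:01:00+00:00,1.30352
2014-11-17 05:02:00+00:00,1.30382
2014-11-17 05:03:00+00:00,1.30373
2014-11-17 05:04:00+00:00,1.30425
2014-11-17 05:05:00+00:00,1.30399
2014-11-17 05:06:00+00:00,1.30378
""")

# Load the sample data as a Series indexed by timestamp.
series = (
    pd.read_csv(data, parse_dates=[0], header=None, names=["dt", "value"])
    .set_index("dt")["value"]
)

# Default bins are closed and labelled on the left: [05:00, 05:02), [05:02, 05:04), ...
# so last() returns the value at the odd minute of each bin.
resampled = series.resample("2min").last()

# Assumed replacement for loffset: move each label forward by half a window.
half_window = pd.Timedelta(minutes=1)
resampled.index = resampled.index + half_window

print(resampled)
```

The resulting index should run from 05:01 to 05:07 in two-minute steps, matching the output above.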