Python Pandas function to resample a time series

Time: 2018-01-25 06:11:48

Tags: python function pandas datetime time-series

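I want my function to resample a pandas Series passed to it at several sampling frequencies. I think I am almost there, except that it seems to keep the old index instead of creating a resampled index, and it produces a lot of NaN values: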

import numpy as np
import pandas as pd

index=pd.date_range('2015-10-1 00:00:00', '2018-12-31 23:50:00', freq='30min')
df=pd.DataFrame(np.random.randn(len(index),2).cumsum(axis=0),columns=['A','B'],index=index)


def resample(ts):
    samples = ['60m','4h','D','1h','W']
    counter = 0
    resampled = {}
    while counter < len(samples):
        for i in samples:
            ts = ts.resample(i).mean()
            resampled[i]=ts
            counter+=1
    return resampled


data = resample(df.A)

data['W']

2015-11-01     21.396793
2015-11-08           NaN
2015-11-15           NaN
2015-11-22           NaN

So basically I want 5 new resampled time series.

Thanks.

2 answers:

Answer 0 (score: 0)

I think you need to upsample the data, so change mean to the ffill or bfill function. Also, 60 minutes needs the alias 60T:

def resample(ts):
    samples = ['60T','4h','D','1h','W']
    resampled = {}
    for i in samples:
        ts = ts.resample(i).ffill()
        resampled[i]=ts
    return resampled
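
To illustrate the upsampling point, here is a minimal sketch on a tiny daily series (the data and the variable names are made up for illustration, not taken from the question). When the target frequency is finer than the source, `mean()` has nothing to average in the new slots and leaves NaN, while `ffill()` carries the last known value forward.

```python
import pandas as pd

# Illustrative daily series with three points (hypothetical data)
daily = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2018-01-01", periods=3, freq="D"),
)

# Upsampling to 12 hours: the new 12:00 slots contain no observations
up_mean = daily.resample("12h").mean()    # empty bins become NaN
up_ffill = daily.resample("12h").ffill()  # empty bins take the previous value

print(up_mean)
print(up_ffill)
```

Which fill method is appropriate depends on the data; `bfill()` would instead take the next known value.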

Answer 1 (score: 0)

index=pd.date_range('2015-10-1 00:00:00', '2018-12-31 23:50:00', freq='30min')
df=pd.DataFrame(np.random.randn(len(index),2).cumsum(axis=0),columns=['A','B'],index=index)

The rest of your code is basically unnecessary:

data = {freq: df['A'].resample(freq).mean() for freq in ['60min','4h','D','1h','W']}

data now has 5 elements, each of which is a resampled Series.
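
If a reusable function is still wanted, the same idea can be sketched as below. The function name `resample_all`, its `freqs` parameter, and the sample data are assumptions made for this example. The key point is that every frequency is computed from the original series. Reassigning `ts` inside the loop, as in the question, makes each pass resample the previous result, so a finer frequency that follows a coarser one becomes an upsample and `mean()` fills it with NaN. The sketch also uses the `min` alias for minutes.

```python
import numpy as np
import pandas as pd

def resample_all(ts, freqs=("60min", "4h", "D", "1h", "W")):
    """Return a dict mapping each frequency to the mean-resampled series.

    Each entry is computed from the original `ts`, never from a
    previously resampled result.
    """
    return {freq: ts.resample(freq).mean() for freq in freqs}

# Illustrative 30-minute series covering three months (hypothetical data)
index = pd.date_range("2015-10-01", "2015-12-31 23:30", freq="30min")
series = pd.Series(np.arange(len(index), dtype=float), index=index)

data = resample_all(series)
print(data["W"].head())
```

Because every target frequency here is coarser than the 30-minute source, each bin contains observations and the weekly result has no NaN values.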