Resampling data and counting samples

Asked: 2015-07-31 14:00:08

Tags: python python-3.x pandas

I'm having some trouble with this function. The output doesn't make sense to me, and perhaps there is a better way to do this?

def resample1min(df):
    # resamples into 1 min bins (post-marked), counts the number of available values within the 1 min period
    # this is due to the way the LiDAR and mast data is stored        

    df_result_1minavg = df.resample('1min', how='mean', closed='right', label='right')

    avail = df.groupby(pd.TimeGrouper('1min', closed='right', label='right'))['hspeed'].apply(lambda x: (~np.isnan(x)).count())
    avail.name = "avail"
    df_result_1minavg = pd.concat([df_result_1minavg, avail], axis=1)

    return df_result_1minavg    

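Basically, I need to resample the raw data (one record every 24 seconds) into 1-minute bins and count how many valid hspeed values fall inside each bin. So far the resampling works fine, but the count of valid values does not.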
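To make the goal concrete, here is a minimal sketch of the intended result on synthetic data, written against the current pandas resampling API (method chaining instead of `how=`, which newer pandas releases no longer accept). The function name `resample_1min_with_count` and the `demo` frame are made up for illustration; only the `hspeed` column name and the right-closed, right-labelled 1-minute bins come from the function above.

```python
import numpy as np
import pandas as pd


def resample_1min_with_count(df, column="hspeed"):
    """Average into right-closed 1-minute bins and count valid values per bin."""
    bins = df.resample("1min", closed="right", label="right")
    result = bins.mean()
    # count() skips NaN, so this is the number of valid values in each bin.
    result["avail"] = bins[column].count()
    return result


# Hypothetical data: one record every 24 s, with one missing hspeed value.
index = pd.date_range(
    "2014-04-30 18:15:00", periods=8, freq=pd.Timedelta(seconds=24)
)
demo = pd.DataFrame(
    {"hspeed": [7.9, 7.6, np.nan, 8.3, 8.0, 8.0, 8.0, 8.0]}, index=index
)

out = resample_1min_with_count(demo)
print(out)
```

With records 24 seconds apart, no 1-minute bin in this sketch can hold more than three rows, so `avail` should never exceed 3.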

For example, here is some of the raw output:

30/04/2014 18:15    6.142705482 -4.973648083    7.938508677 308.9965255
30/04/2014 18:15    5.515934842 -5.22163357 7.628816391 313.4299983
30/04/2014 18:15    5.820077951 -5.336002619    7.930640318 312.5154279
30/04/2014 18:16    6.108619196 -5.495682719    8.253005862 311.9764602
30/04/2014 18:16    5.815988388 -5.421155709    7.985682142 312.987661
30/04/2014 18:17    5.951908553 -5.299767675    8.004489122 311.6828797
30/04/2014 18:17    6.007600178 -5.284258473    8.036053689 311.3347153
30/04/2014 18:17    5.879259541 -5.345762306    7.981146823 312.2789224  

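Everything looks great! But the 1-minute resampled output says there are 6 samples in the 18:16 bin! That can't be right, because each scan takes 24 seconds, so a 1-minute bin can hold at most three samples. What is going wrong? Thanks in advance :)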
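One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: `(~np.isnan(x)).count()` counts every entry of the boolean mask, `True` or `False`, so it returns the number of rows in the bin rather than the number of valid values. The sketch below, on made-up numbers, contrasts that with `.sum()` on the mask and with `Series.count()`, and also shows how to list the raw row count per bin with `pd.Grouper` (the current replacement for `pd.TimeGrouper`).

```python
import numpy as np
import pandas as pd

# Hypothetical values for one bin: two valid readings and one NaN.
x = pd.Series([7.9, np.nan, 8.3])
mask = ~np.isnan(x)          # boolean Series: True, False, True

n_all = mask.count()         # counts all non-null mask entries
n_valid = mask.sum()         # counts only the True entries
n_valid_alt = x.count()      # Series.count() already skips NaN

# Raw number of rows that land in each right-closed 1-minute bin.
index = pd.date_range(
    "2014-04-30 18:15:24", periods=5, freq=pd.Timedelta(seconds=24)
)
s = pd.Series([7.6, np.nan, 8.3, 8.0, 8.0], index=index, name="hspeed")
rows_per_bin = s.groupby(
    pd.Grouper(freq="1min", closed="right", label="right")
).size()

print(n_all, n_valid, n_valid_alt)
print(rows_per_bin)
```

This difference alone would not produce 6 in a bin that really holds at most three rows. If the raw row count per bin also comes out as 6, the index itself may contain more rows per minute than expected (for example duplicated timestamps), which is an assumption to verify against the actual data.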

0 answers:

No answers yet