Optimizing time series generation

Time: 2015-09-21 17:13:37

Tags: python pandas

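I have a time series in which every value valX at time tX has two values associated with it (minX and maxX). As the figure below shows, these values always satisfy minX < valX < maxX.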

Now I want to create a new time series that associates with each tX the first value in the time series, at some tY > tX, that is either less than minX or greater than maxX:

enter image description here

This is the implementation I came up with:

import pandas
import numpy as np

# An example data frame
np.random.seed(1)
df = pandas.DataFrame(np.random.rand(10, 3), columns=['min', 'max', 'val'])
df['max'] += 1
df['val'] = (df['min'] + df['max']) / 2.

# An auxiliary column, that will be shifted
df['shift'] = df['val'].copy()

# This is the time series I am looking for (initialized with NaN values)
df['result'] = np.nan

# Main loop: shift the values up by one row per iteration and fill in
# every still-empty result whose shifted value falls outside [min, max]
LIMIT = len(df)
for i in range(LIMIT):
    df['shift'] = df['shift'].shift(-1)
    df['result'].update(df['shift'][((df['shift'] < df['min']) |
                                     (df['shift'] > df['max'])) &
                                    (df['result'].isnull())])

# Data frame is well-formed
df

It produces the correct result:

enter image description here

I would like to know whether there is a better way to do this, in particular one that runs faster.

1 answer:

Answer 0 (score: 3)

numba usually works well for this kind of problem. You can also get similar results with cython, at the cost of a few more annotations.

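import numba
import numpy as np

@numba.jit(nopython=True)
def generate_values(mins, maxs, vals):
    N = len(vals)
    ans = np.empty(N)

    for i in range(N):
        # Scan forward for the first value outside [mins[i], maxs[i]]
        for j in range(i, N):
            if vals[j] < mins[i] or vals[j] > maxs[i]:
                ans[i] = vals[j]
                break
        else:
            # No later value leaves the range
            ans[i] = np.nan
    return ans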

It is a bit verbose, but very fast.

In [278]: %%time
     ...: LIMIT = len(df)
     ...: for i in range(LIMIT):
     ...:     df['shift'] = df['shift'].shift(-1)
     ...:     df['result'].update(df['shift'][((df['shift'] < df['min']) | \
     ...:                                      (df['shift'] > df['max'])) & \
     ...:                                     (df['result'].isnull())])
Wall time: 62 ms


In [281]: %timeit generate_values(df['min'].values, df['max'].values, df['val'].values)
10000 loops, best of 3: 20.6 µs per loop
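
To sanity-check either implementation on small inputs, a plain-Python version of the same double loop can be handy. The following is only a minimal sketch: the function name `first_escape` and the sample data are hypothetical and not part of the original post. It starts the inner scan at `i + 1` to mirror the tY > tX condition; this agrees with the `range(i, N)` loop above whenever minX < valX < maxX holds, because a value inside its own range can never trigger the condition at `j == i`.

```python
import math


def first_escape(mins, maxs, vals):
    """For each index i, return the first vals[j] with j > i that lies
    outside [mins[i], maxs[i]], or NaN if no later value leaves the range."""
    n = len(vals)
    result = [math.nan] * n
    for i in range(n):
        for j in range(i + 1, n):
            if vals[j] < mins[i] or vals[j] > maxs[i]:
                result[i] = vals[j]
                break
    return result


# Hypothetical sample data satisfying min < val < max on every row
mins = [0.0, 1.0, 2.0, 0.0]
maxs = [2.0, 3.0, 4.0, 10.0]
vals = [1.0, 2.0, 3.0, 5.0]

expected = first_escape(mins, maxs, vals)
print(expected)
```

Comparing this list against `df['result']` or the output of `generate_values` on the same inputs (treating NaN positions as equal) should reveal any off-by-one mistakes in the faster versions.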