How can I optimize the following pandas function?

Time: 2019-03-27 09:23:52

Tags: python pandas

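I am trying to optimize the routine below, which is executed once per frame. It computes an object's travelled distance, elapsed time, and velocity per time step delta t.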

The input is just a DataFrame with x_world and y_world coordinates and a frame counter (frame).
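For anyone who wants to reproduce this, an input frame of that shape could look like the sketch below. All values are hypothetical illustration data, and `trajectory_id` is included only because the routine groups by it.

```python
import pandas as pd

# All values are hypothetical illustration data.
df = pd.DataFrame({
    'x_world': [120.0, 123.0, 126.0, 126.0],
    'y_world': [320.0, 324.0, 328.0, 328.0],
    'trajectory_id': [1, 1, 1, 1],
    'frame': [2, 3, 4, 5],  # frame counter, one row per frame
})
```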

First, the routine takes the coordinate differences and computes the per-row displacement ds:

import numpy as np


def compute_s_t(df,
                gb=('session_time', 'trajectory_id'),
                params=('t', 's', 's_normalized', 'v_direct', 't_abs'),
                fps=25, inplace=False):
    # note: gb and inplace are accepted but not used below
    _df = df.copy()
    orig_columns = _df.columns.values.tolist()
    first = _df.index[0]  # label of the first row, where .diff() returns NaN

    # compute travelled distance
    _df['dx'] = _df['x_world'].diff()
    _df['dy'] = _df['y_world'].diff()
    _df['ds'] = np.sqrt(np.array(_df['dx'] ** 2 + _df['dy'] ** 2, dtype=np.float32))
    _df.loc[first, 'ds'] = 0  # to avoid NaN returned by .diff()
    _df['ds'] = _df['ds'][~_df['ds'].index.duplicated()]
    _df['s'] = _df['ds'].cumsum()
    _df.loc[first, 's'] = 0
    # _df['s'] = (_df.groupby('trajectory_id')['s']
    #             .transform(subtract_nanmin))

    # compute travelled time
    _df['dt'] = _df['frame'].diff() / fps
    _df.loc[first, 'dt'] = 0  # to avoid NaN returned by .diff()
    _df['t'] = _df['dt'].cumsum()
    _df.loc[first, 't'] = 0
    # _df['t'] = (_df.groupby('trajectory_id')['t']
    #             .transform(subtract_nanmin))

    _df['t_abs'] = _df['frame'] / fps
    _df.loc[first, 't_abs'] = 0

    # compute velocity
    # why values[:, 0]? why duplicate column?
    _df['v_direct'] = _df['ds'].values / _df['dt'].values
    _df.loc[_df['t'] == 0, 'v_direct'] = 0  # no time elapsed: define velocity as 0

    # compute normalized s (divide_nanmax is defined further down)
    _df['s_normalized'] = (_df.groupby('trajectory_id')['s']
                           .transform(divide_nanmax))

    # skip intermediate results
    cols = orig_columns + list(params)
    return _df[cols]

The function takes about 50 ms.
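One direction that might reduce the per-call cost, offered as an untimed sketch rather than a measured fix: do the arithmetic on NumPy arrays and attach the result columns in a single `assign`, instead of writing intermediate columns and patching their first row one by one. The name `compute_s_t_fast` is hypothetical. The sketch assumes the `x_world`, `y_world`, and `frame` columns described above, treats the whole frame as one trajectory (as the cumulative sums in the routine do), leaves `t_abs` as `frame / fps`, and sets the velocity to 0 wherever `dt` is not positive.

```python
import numpy as np
import pandas as pd


def compute_s_t_fast(df, fps=25):
    """Hypothetical NumPy-based variant of compute_s_t (sketch only)."""
    x = df['x_world'].to_numpy(dtype=np.float64)
    y = df['y_world'].to_numpy(dtype=np.float64)
    frame = df['frame'].to_numpy(dtype=np.float64)

    # prepend the first element so the first difference is 0 instead of NaN
    ds = np.hypot(np.diff(x, prepend=x[:1]), np.diff(y, prepend=y[:1]))
    dt = np.diff(frame, prepend=frame[:1]) / fps

    # assumption: velocity is 0 wherever no time has elapsed (dt <= 0)
    v = np.divide(ds, dt, out=np.zeros_like(ds), where=dt > 0)

    # assign returns a new frame; the input df is left unchanged
    return df.assign(s=np.cumsum(ds), t=np.cumsum(dt),
                     t_abs=frame / fps, v_direct=v)
```

Whether this is actually faster on real data would need to be checked, for example with `timeit` on a representative frame.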

Sample input values:

x_world: 120
y_world: 320
trajectory_id: 1
frame (a counter): 2

def divide_nanmax(x):
    # divide each value by the group's maximum (ignoring NaN) so s is scaled to [0, 1]
    return x / np.nanmax(x)
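
Because `transform` calls a Python function once per group, the normalization step might be cheaper with the built-in `'max'` aggregation, which skips NaN by default. A minimal sketch, assuming the intent is each value of `s` divided by its per-trajectory maximum; the data below is made up:

```python
import pandas as pd

# Hypothetical data: two trajectories with cumulative distances.
df = pd.DataFrame({
    'trajectory_id': [1, 1, 1, 2, 2],
    's': [0.0, 2.0, 4.0, 0.0, 5.0],
})

# Maximum of s within each trajectory, broadcast back to every row.
s_max = df.groupby('trajectory_id')['s'].transform('max')

# Normalized path length in [0, 1] for each trajectory.
df['s_normalized'] = df['s'] / s_max
```

A trajectory whose maximum is 0 would yield NaN from 0 / 0 here, so that case may need separate handling.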

0 answers:

No answers