Vectorizing numpy code that depends on previous values

Date: 2018-10-08 15:50:54

Tags: python numpy vectorization

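The following code models a system that occupies one of 3 different states at any point in time, with constant transition probabilities between the states given by the matrix prob_nor. Each point in trace therefore depends on the previous state.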

import numpy as np

n_states, n_frames = 3, 1000
state_val = np.linspace(0, 1, n_states)

prob = np.random.randint(1, 10, size=(n_states,)*2)
prob[np.diag_indices(n_states)] += 50

prob_nor = prob/prob.sum(1)[:,None] # transition probability matrix, 
                                    # row sum normalized to 1.0

state_idx = range(n_states) # states is a list of integers 0, 1, 2...
current_state = np.random.choice(state_idx)

trace = []      
sigma = 0.1     
for _ in range(n_frames):
    trace.append(np.random.normal(loc=state_val[current_state], scale=sigma))
    current_state = np.random.choice(state_idx, p=prob_nor[current_state, :])

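The loop in the code above makes it very slow, especially when I have to model millions of data points. Is there any way to vectorize it or otherwise speed it up?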

2 answers:

Answer 0 (score: 3)

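Do the random draws up front, outside the loop. For each possible current state, pre-draw the next state for every frame: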

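# np.vstack expects a sequence of arrays; passing a bare generator
# is deprecated and fails on recent NumPy versions, so build a list.
possible_paths = np.vstack([
    np.random.choice(state_idx, p=prob_nor[curr_state, :], size=n_frames)
    for curr_state in range(n_states)
])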

Then you can simply follow the path with lookups:

path_trace = [None]*n_frames
for step in range(n_frames):
    path_trace[step] = possible_paths[current_state, step]
    current_state = possible_paths[current_state, step]

Once you have the path, you can compute the trace:

sigma = 0.1
trace = np.random.normal(loc=state_val[path_trace], scale=sigma, size=n_frames)
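
The three steps above can be combined into one self-contained sketch. The function name `simulate_trace`, its signature, and the use of NumPy's `default_rng` Generator API are assumptions made for illustration; the answer itself uses the legacy `np.random` functions.

```python
import numpy as np

def simulate_trace(prob_nor, state_val, n_frames, sigma=0.1, rng=None):
    """Hypothetical helper: pre-draw transitions, follow them, add noise."""
    rng = np.random.default_rng() if rng is None else rng
    n_states = len(state_val)

    # Step 1: for every possible current state, pre-draw the next state
    # for every frame (one vectorized call per state).
    possible_paths = np.vstack([
        rng.choice(n_states, p=prob_nor[s], size=n_frames)
        for s in range(n_states)
    ])

    # Step 2: follow the pre-drawn transitions. This is still a Python
    # loop, but each iteration is only an array lookup.
    path = np.empty(n_frames, dtype=np.intp)
    current_state = rng.integers(n_states)
    for step in range(n_frames):
        current_state = possible_paths[current_state, step]
        path[step] = current_state

    # Step 3: draw all noisy observations in one vectorized call.
    trace = rng.normal(loc=state_val[path], scale=sigma)
    return path, trace

# Example setup mirroring the question (seed chosen arbitrarily).
rng = np.random.default_rng(0)
n_states, n_frames = 3, 1000
state_val = np.linspace(0, 1, n_states)
prob = rng.integers(1, 10, size=(n_states, n_states)).astype(float)
prob[np.diag_indices(n_states)] += 50
prob_nor = prob / prob.sum(axis=1, keepdims=True)

path, trace = simulate_trace(prob_nor, state_val, n_frames, rng=rng)
```

Only the row of `possible_paths` that matches the actual current state is used at each step, so most of the pre-drawn values are discarded; the approach trades extra random draws and memory for fewer expensive calls inside the loop.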

Timing comparison:

Pure Python for loop

%%timeit
trace_list = []
current_state = np.random.choice(state_idx)
for _ in range(n_frames):
    trace_list.append(np.random.normal(loc=state_val[current_state], scale=sigma))
    current_state = np.random.choice(state_idx, p=prob_nor[current_state, :])

Result:

30.1 ms ± 436 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Vectorized lookup

%%timeit
current_state = np.random.choice(state_idx)
path_trace = [None]*n_frames
possible_paths = np.vstack([
    np.random.choice(state_idx, p=prob_nor[curr_state, :], size=n_frames)
    for curr_state in range(n_states)
])
for step in range(n_frames):
    path_trace[step] = possible_paths[current_state, step]
    current_state = possible_paths[current_state, step]
trace = np.random.normal(loc=state_val[path_trace], scale=sigma, size=n_frames)

Result:

641 µs ± 6.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

A speedup of roughly 50x.
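
A timing comparison does not show that the faster version still samples the same process. One possible sanity check, sketched below, is to count the transitions observed in a long generated path and compare the row-normalized counts with prob_nor. The frame count, the seed, and the names `counts`, `empirical`, and `max_abs_error` are illustrative choices, not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_frames = 3, 200_000  # long path so the estimate is tight

# Same kind of transition matrix as in the question.
prob = rng.integers(1, 10, size=(n_states, n_states)).astype(float)
prob[np.diag_indices(n_states)] += 50
prob_nor = prob / prob.sum(axis=1, keepdims=True)

# Build a state path with the pre-drawn lookup approach.
possible_paths = np.vstack([
    rng.choice(n_states, p=prob_nor[s], size=n_frames)
    for s in range(n_states)
])
path = np.empty(n_frames + 1, dtype=np.intp)
path[0] = rng.integers(n_states)
for step in range(n_frames):
    path[step + 1] = possible_paths[path[step], step]

# Count observed (from, to) transitions and normalize each row.
counts = np.zeros((n_states, n_states))
np.add.at(counts, (path[:-1], path[1:]), 1)
empirical = counts / counts.sum(axis=1, keepdims=True)

# Largest deviation between the empirical and the intended matrix.
max_abs_error = np.abs(empirical - prob_nor).max()
print(max_abs_error)
```

If the lookup scheme is correct, `max_abs_error` should shrink as `n_frames` grows.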

Answer 1 (score: 2)

Maybe I'm missing something, but I think you can build the sequence of current_state values as a list first and then vectorize the remaining steps:

# Make list of states (slow part)
states = []
current_state = np.random.choice(state_idx)
for _ in range(n_frames):
    states.append(current_state)
    current_state = np.random.choice(state_idx, p=prob_nor[current_state, :])

# Vectorised part
state_vals = state_val[states]   # alternatively np.array(states) / (n_states - 1)
trace = np.random.normal(loc=state_vals, scale=sigma)  # centre on the state values, not the state indices

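I believe this approach works and gives a modest speedup at the cost of some extra memory (three lists/arrays are created instead of one). @PMende's solution gives a much larger speedup.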
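
The "slow part" above still calls np.random.choice once per frame. A different technique, not taken from either answer, is inverse-CDF sampling: draw all uniform random numbers in one call and, inside the loop, look up the next state in the cumulative row probabilities with np.searchsorted. The sketch below assumes the same setup as the question; the names `cum_prob` and `u` are illustrative, and no timing is claimed for it.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_frames, sigma = 3, 1000, 0.1
state_val = np.linspace(0, 1, n_states)

prob = rng.integers(1, 10, size=(n_states, n_states)).astype(float)
prob[np.diag_indices(n_states)] += 50
prob_nor = prob / prob.sum(axis=1, keepdims=True)

# Cumulative transition probabilities per row; force the last column to
# exactly 1.0 so rounding can never push a lookup past the last state.
cum_prob = np.cumsum(prob_nor, axis=1)
cum_prob[:, -1] = 1.0

# One vectorized draw of uniforms in [0, 1) for the whole path.
u = rng.random(n_frames)

states = np.empty(n_frames, dtype=np.intp)
current_state = rng.integers(n_states)
for i in range(n_frames):
    states[i] = current_state
    # Next state k satisfies cum_prob[k-1] <= u[i] < cum_prob[k].
    current_state = np.searchsorted(cum_prob[current_state], u[i], side="right")

# Vectorised part, as in the answer.
trace = rng.normal(loc=state_val[states], scale=sigma)
```

The loop itself remains, because each state depends on the previous one; only the random number generation is moved out of it.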