How can I vectorize a matrix sum inside a for loop using numpy?

Time: 2014-08-21 22:13:45

Tags: python numpy scipy vectorization broadcast

Basically, I have a matrix with 3600 rows and 5 columns, and I want to downsample it by summing parcels of 60 rows:

import numpy as np

X = np.random.rand(3600,5)

down_sample = 60
ds_rng = range(0,X.shape[0],down_sample)
X_ds = np.zeros((len(ds_rng),X.shape[1]))

i = 0
for j in ds_rng:
    X_ds[i,:] = np.sum( X[j:j+down_sample,:], axis=0 )
    i += 1

3 Answers:

Answer 0 (score: 3)

Another approach might be:

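def blockwise_sum(X, down_sample=60):
    n, m = X.shape

    # Number of complete parcels, and the number of rows they cover.
    # Integer division (//) keeps ds_n usable as a shape and slice index.
    ds_n = n // down_sample
    N = ds_n * down_sample

    if N == n:
        # Rows divide evenly: view as (parcels, down_sample, m) and sum each parcel
        return np.sum(X.reshape(-1, down_sample, m), axis=1)

    # Otherwise reserve one extra output row for the leftover rows
    X_ds = np.zeros((ds_n + 1, m))
    X_ds[:ds_n] = np.sum(X[:N].reshape(-1, down_sample, m), axis=1)
    X_ds[-1] = np.sum(X[N:], axis=0)

    return X_ds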

I don't know whether it will be any faster.

Answer 1 (score: 2)

At least in this case, einsum is faster than sum.

np.einsum('ijk->ik',x.reshape(-1,down_sample,x.shape[1]))

It is 2x faster than blockwise_sum.
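To illustrate what the einsum subscripts do: 'ijk->ik' leaves the middle index j out of the output, so the reshaped (parcels, down_sample, columns) array is summed over the down_sample axis. The following is only a small sketch, assuming the row count is an exact multiple of down_sample; the tiny shapes are chosen for illustration and are not from the answer.

```python
import numpy as np

down_sample = 3                                # illustrative parcel size
X = np.arange(12, dtype=float).reshape(6, 2)   # 6 rows, 2 columns

# Group the rows into parcels: shape (2 parcels, 3 rows each, 2 columns)
blocks = X.reshape(-1, down_sample, X.shape[1])

# Index j is missing from the output, so einsum sums over it
via_einsum = np.einsum('ijk->ik', blocks)

# The same reduction written with an explicit axis
via_sum = blocks.sum(axis=1)

print(via_einsum)
```

Both expressions should agree; einsum simply takes a different code path inside NumPy, which is where any speed difference comes from.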

My timings:

OP iterative  - 1.59 ms
with strided  -   198 us
blockwise_sum -   179 us
einsum        -    76 us
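Timings like these depend on the machine and the NumPy version, so it may be worth re-measuring locally. Below is a rough sketch of one way to do that with the standard timeit module. The function names loop_sum, reshape_sum and einsum_sum are hypothetical wrappers introduced here, not names from the answers, and the sketch assumes the row count divides evenly by down_sample.

```python
import timeit
import numpy as np

X = np.random.rand(3600, 5)
down_sample = 60

def loop_sum(X, down_sample):
    # The original iterative approach from the question
    starts = range(0, X.shape[0], down_sample)
    out = np.zeros((len(starts), X.shape[1]))
    for i, j in enumerate(starts):
        out[i, :] = X[j:j + down_sample, :].sum(axis=0)
    return out

def reshape_sum(X, down_sample):
    # Reshape into parcels and sum over the parcel axis
    return X.reshape(-1, down_sample, X.shape[1]).sum(axis=1)

def einsum_sum(X, down_sample):
    # Same reduction expressed with einsum
    return np.einsum('ijk->ik', X.reshape(-1, down_sample, X.shape[1]))

runs = 100
timings = {}
for f in (loop_sum, reshape_sum, einsum_sum):
    # Take the best of three repeats to reduce noise from other processes
    best = min(timeit.repeat(lambda: f(X, down_sample), number=runs, repeat=3))
    timings[f.__name__] = best / runs
    print("%-12s %8.1f us per call" % (f.__name__, timings[f.__name__] * 1e6))
```

The absolute numbers will differ from the table above; only the relative ordering on your own hardware is meaningful.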

Answer 2 (score: 1)

It looks like you can use some stride tricks to get the job done.

Here is the setup code we need:

import numpy as np
X = np.random.rand(1000,5)
down_sample = 60

Now we fool NumPy into thinking X is already split into parcels:

num_parcels = int(np.ceil(X.shape[0] / float(down_sample)))
X_view = np.lib.stride_tricks.as_strided(X, shape=(num_parcels,down_sample,X.shape[1]))

X_ds = X_view.sum(axis=1)  # sum over the down_sample axis

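Finally, if your downsampling interval does not evenly divide the number of rows, you need to fix the last row of X_ds, because the strided view runs past the end of X's data and the last parcel picks up values that do not belong to X: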

rem = X.shape[0] % down_sample
if rem != 0:
    X_ds[-1] = X[-rem:].sum(axis=0)
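If you would rather avoid as_strided, one alternative (a different technique from the one in this answer) is np.add.reduceat, which sums the slices between consecutive start indices and treats the final slice as running to the end of the array. A minimal sketch, assuming the same X and down_sample as in the setup code; the name starts is introduced here for illustration:

```python
import numpy as np

X = np.random.rand(1000, 5)
down_sample = 60

# Start row of each parcel: 0, 60, 120, ..., 960
starts = np.arange(0, X.shape[0], down_sample)

# Sum X[starts[i]:starts[i+1]] along the rows; the last parcel runs to the end of X
X_ds = np.add.reduceat(X, starts, axis=0)

print(X_ds.shape)
```

Because the last parcel is just a shorter slice, no separate fix-up step is needed when the row count is not a multiple of down_sample. Whether this is faster than the reshape or einsum versions is something to measure rather than assume.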