Slicing a numpy array to avoid for loops

Time: 2015-04-02 14:31:34

Tags: python numpy

I am using numpy for some computations. In the following code:

    assert A.ndim == 2  # A is a 2D ndarray
    d1, d2 = A.shape
    # Initialize G with zeros, with the same shape as A,
    # then copy the last column of A into the last column of G.
    G = np.zeros_like(A)
    G[:, d2-1] = A[:, d2-1]

    # Columns [0, d2-2] of G hold the average of the same column of A,
    # taken over the rows selected by the condition on B.
    for iW in range(d2-1):
        n = 0
        sum = 0.0
        for i in range(d1):
            if B[i, 0] != iW and B[i, 1] == 0:
                sum += A[i, iW]
                n += 1
        for i in range(d1):
            if B[i, 0] != iW and B[i, 1] == 0:
                G[i, iW] = sum / (1.0 * n)
    return G

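Is there a simpler way to do this using "slicing" or "boolean arrays"?

Thanks!

2 Answers:

Answer 0: (score: 0)

If you want G to have the same dimensions as A and then change the appropriate elements of G, the following code should work: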

# create G as a copy of A, otherwise you might change A by changing G
G = A.copy()

# getting the mask for all columns except the last one
m = (B[:,0][:,None] != np.arange(d2-1)[None,:]) & (B[:,1]==0)[:,None]

# getting a matrix with those elements of A which fulfills the conditions
C = np.where(m,A[:,:d2-1],0).astype(float)

# get the 'modified' average you use
avg = np.sum(C,axis=0)/np.sum(m.astype(int),axis=0)

# change the appropriate elements in all the columns except the last one
G[:,:-1] = np.where(m,avg,A[:,:d2-1])

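After fiddling with this for quite a while and finding bugs, I ended up with this code. I checked it against several random matrices A and one specific choice of B:

A = np.random.randint(100,size=(5,10))
B = np.column_stack(([4,2,1,3,4],np.zeros(5)))

So far, your results and mine agree.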
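
For anyone who wants to reproduce such a check, here is a minimal, self-contained sketch. It wraps the question's loop and a broadcast-mask version in two functions and compares them on random data. The names loop_version and vectorized_version are made up for illustration. Two things are assumptions rather than part of the code above: G starts from zeros (as in the question, not from a copy of A), and G is stored as float so the averages are not truncated.

```python
import numpy as np


def loop_version(A, B):
    """Reference implementation: the question's loops, with a float G."""
    d1, d2 = A.shape
    G = np.zeros_like(A, dtype=float)   # float dtype is an assumption
    G[:, -1] = A[:, -1]
    for iW in range(d2 - 1):
        # rows that satisfy the condition for this column
        rows = [i for i in range(d1) if B[i, 0] != iW and B[i, 1] == 0]
        if rows:
            G[rows, iW] = sum(A[i, iW] for i in rows) / len(rows)
    return G


def vectorized_version(A, B):
    """Same result using one (d1, d2-1) boolean mask built by broadcasting."""
    d1, d2 = A.shape
    G = np.zeros_like(A, dtype=float)
    G[:, -1] = A[:, -1]
    # m[i, iW] is True where row i contributes to column iW
    m = (B[:, [0]] != np.arange(d2 - 1)) & (B[:, [1]] == 0)
    counts = m.sum(axis=0)
    sums = np.where(m, A[:, :-1], 0.0).sum(axis=0)
    # divide only where at least one row was selected
    avg = np.divide(sums, counts, out=np.zeros(d2 - 1), where=counts > 0)
    G[:, :-1] = np.where(m, avg, 0.0)
    return G


rng = np.random.default_rng(0)
A = rng.integers(0, 100, size=(5, 10))
B = np.column_stack(([4, 2, 1, 3, 4], np.zeros(5, dtype=int)))
print(np.allclose(loop_version(A, B), vectorized_version(A, B)))
```

The guard on counts only matters for a column in which no row satisfies the condition; the loop leaves such a column at zero, and the sketch does the same.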

Answer 1: (score: 0)

Here is a start, focusing on the first inner loop (the session below uses iW = 0):

In [35]: A=np.arange(12).reshape(3,4)

In [36]: B=np.array([[0,0],[1,0],[2,0]])

In [37]: sum=0

In [38]: for i in range(3):
   ....:     if B[i,0]!=iW and B[i,1]==0:
   ....:         sum += A[i,iW]
   ....:         print(i,A[i,iW])
   ....:         
1 4
2 8

In [39]: A[(B[:,0]!=iW)&(B[:,1]==0),iW].sum()
Out[39]: 12

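I had to supply my own sample data to test this.

The second loop has the same condition, (B[:,0]!=iW)&(B[:,1]==0), and should work the same way.

As one of the comments said, the dimensions of G look odd. To make my sample work, let's create it as an array of zeros. It looks like you are assigning the mean of a subset of A (sum/n) to the selected elements of G. Here I is presumably the boolean mask from the condition above: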

In [52]: G=np.zeros_like(A)
In [53]: G[I,iW]=A[I,iW].mean()

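Assuming that n, the number of terms summed for each iW, varies, it may be hard to compress the outer loop into a single vectorized step. If n were the same for every column, you could pull out the subset of A that meets the condition, say A1, take the mean along one axis, and assign the values to the corresponding entries of the output array. With a different number of terms in each sum, you still need to loop.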
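
A minimal sketch of that intermediate form, keeping the outer loop over columns and replacing both inner loops with one boolean row mask. It reuses the sample A and B from this answer; the float dtype for G and the guard for an empty selection are my own assumptions, not part of the question's code.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
B = np.array([[0, 0], [1, 0], [2, 0]])
d1, d2 = A.shape

# float dtype is an assumption, so that non-integer means are kept
G = np.zeros(A.shape, dtype=float)
G[:, -1] = A[:, -1]          # last column is copied unchanged

for iW in range(d2 - 1):
    # rows of column iW that satisfy the condition on B
    I = (B[:, 0] != iW) & (B[:, 1] == 0)
    if I.any():              # skip columns with no selected rows
        G[I, iW] = A[I, iW].mean()

print(G)
```

This still loops d2-1 times in Python, but each iteration is a single masked mean and a single masked assignment instead of two passes over the rows.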

It occurred to me that a masked array might work. Mask the terms of A that do not meet the condition, then take the mean.

In [91]: I=(B[:,[0]]!=np.arange(4))&(B[:,[1]]==0)

In [92]: I
Out[92]: 
array([[False,  True,  True,  True],
       [ True, False,  True,  True],
       [ True,  True, False,  True]], dtype=bool)

In [93]: A1=np.ma.masked_array(A, ~I)

In [94]: A1
Out[94]: 
masked_array(data =
 [[-- 1 2 3]
 [4 -- 6 7]
 [8 9 -- 11]],
             mask =
 [[ True False False False]
 [False  True False False]
 [False False  True False]],
       fill_value = 999999)

In [95]: A1.mean(0)
Out[95]: 
masked_array(data = [6.0 5.0 4.0 7.0],
             mask = [False False False False],
       fill_value = 1e+20)

Or use where, as in plonser's answer.
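
To round this off, here is a hedged sketch that combines the two ideas: a masked array for the per-column means and np.where to write them back into G. It uses the same sample A and B as above. Restricting the mask to all columns but the last, filling fully masked columns with 0, and the float dtype of G are assumptions on my part.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
B = np.array([[0, 0], [1, 0], [2, 0]])
d2 = A.shape[1]

# I[i, iW] is True where row i contributes to column iW (last column excluded)
I = (B[:, [0]] != np.arange(d2 - 1)) & (B[:, [1]] == 0)

# mask out the entries that do not meet the condition, then average each column
A1 = np.ma.masked_array(A[:, :-1], mask=~I)
col_mean = A1.mean(axis=0).filled(0.0)   # 0.0 for a fully masked column

G = np.zeros(A.shape, dtype=float)
G[:, :-1] = np.where(I, col_mean, 0.0)   # selected entries get the column mean
G[:, -1] = A[:, -1]                      # last column is copied unchanged

print(G)
```

The masked mean handles the varying number of terms per column, which is what made the outer loop hard to remove by plain slicing.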