Question

I have a 2-row array called C, built like this:

from numpy import *
A = array([1,2,3,4,5])          # arrays, so that A >= i and A[mask] work element-wise
B = array([50,40,30,20,10])
C = vstack((A,B))

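I want to take all the columns of C where the value in the first row falls between i and i + 2, and average them. I can do this for A alone without any problem: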

i = 0
A_avg = []

while(i<6):
    selection = A[logical_and(A >= i, A < i+2)] 
    A_avg.append(mean(selection))
    i += 2

A_avg is then:

[1.0,2.5,4.5]

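I want to do the same thing with my two-row array C, but average each row separately, with the blocks determined by the first row. For example, for C I want to end up with a 2 x 3 array that looks like: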

[[1.0,2.5,4.5],
 [50,35,15]]

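The first row is A averaged over the blocks between i and i + 2, and the second row is B averaged over the same blocks as A, regardless of the values B holds. So the first entry is left unchanged, the next two entries are averaged together, and the two after that are averaged together, with each row averaged separately. Does anyone know a clever way to do this? Many thanks!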

Answer 1

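I hope this isn't too clever. TIL that boolean indexing does not broadcast, so I had to do the broadcasting manually. Let me know if anything is unclear.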

import numpy as np
A = [1,2,3,4,5]
B = [50,40,30,20,10]
C = np.vstack((A,B)) # integer array; converted to float below so that np.nan can be used

i = np.arange(0, 6, 2)[:, None]
selections = np.logical_and(A >= i, A < i+2)[None]

D, selections = np.broadcast_arrays(C[:, None], selections)
D = D.astype(float)     # float allows nan; the copy avoids writing through the broadcast view
D[~selections] = np.nan # exclude these elements from mean

D = np.nanmean(D, axis=-1)

Then,

>>> D
array([[  1. ,   2.5,   4.5],
       [ 50. ,  35. ,  15. ]])
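
In case the manual broadcast above looks mysterious, here is a small shape check. It is only a sketch: the names edges, mask, full_mask and mask_was_broadcast are mine, not part of the answer's code. It shows that indexing the (2, 5) array with a (3, 5) boolean mask is rejected, while np.broadcast_arrays brings both operands to a common (2, 3, 5) shape.

```python
import numpy as np

C = np.array([[1, 2, 3, 4, 5],
              [50, 40, 30, 20, 10]])
edges = np.arange(0, 6, 2)[:, None]            # shape (3, 1): left edge of each block
mask = (C[0] >= edges) & (C[0] < edges + 2)    # shape (3, 5): one row per block

# A boolean index has to match the shape of the dimensions it indexes;
# NumPy does not broadcast it against C, so this raises IndexError.
try:
    C[mask]
    mask_was_broadcast = True
except IndexError:
    mask_was_broadcast = False

# Broadcasting both operands explicitly gives matching shapes.
D, full_mask = np.broadcast_arrays(C[:, None], mask[None])
shapes = (D.shape, full_mask.shape)            # both (rows, blocks, columns)
```

The last axis of the broadcast result runs over the original columns, which is why the answer takes the nanmean over axis=-1.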

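Another approach is to use np.histogram to bin your data. This is probably faster for large arrays, but it is only practical for a small number of rows, because each row needs its own histogram call with different weights: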

bins = np.arange(0, 7, 2)     # include the end
n = np.histogram(A, bins)[0]  # number of columns in each bin
a_mean = np.histogram(A, bins, weights=A)[0]/n
b_mean = np.histogram(A, bins, weights=B)[0]/n
D = np.vstack([a_mean, b_mean])
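
If there are many rows, one histogram call per row gets tedious. A possible variation (my own sketch, not from the code above) is to compute each column's block index once with np.digitize and then sum every row per block with a single matrix product. The names idx, onehot, sums and counts are illustrative. Note one difference from np.histogram: a value exactly equal to the last edge (6 here) falls outside the final block instead of being included in it.

```python
import numpy as np

C = np.array([[1, 2, 3, 4, 5],
              [50, 40, 30, 20, 10]], dtype=float)
bins = np.arange(0, 7, 2)                  # block edges: [0, 2, 4, 6]
n_blocks = len(bins) - 1

# Block index of every column, decided by the first row only.
idx = np.digitize(C[0], bins) - 1

# Membership matrix: onehot[k, j] is 1.0 when column j belongs to block k.
onehot = (idx == np.arange(n_blocks)[:, None]).astype(float)   # shape (3, 5)

sums = C @ onehot.T                        # per-row sums within each block, shape (2, 3)
counts = onehot.sum(axis=1)                # number of columns in each block
D = sums / counts                          # an empty block would give nan here
```

For the example data this should reproduce the same 2 x 3 result as the two approaches above, and it works unchanged for any number of rows in C.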

Averaging sections of a multi-row array in Python
