Question

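I'm having trouble performing a column-wise operation on each column of a 2-D numpy array. I'm trying to adapt my case to this answer, although my setup is different. My actual dataset is quite large and involves multiple resamplings, hence the syntax of the example below. If the code and explanation look too long, consider skipping ahead to the heading "Relevant".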

Skippable (only here to reproduce zs)

Consider a dataset (x_n, y_n), where n = 0, 1, or 2.

import numpy as np

def get_xy(num, size=10):
    ## (x1, y1), (x2, y2), (x3, y3) where xi, yi are both arrays
    if num == 0:
        x = np.linspace(7, size+6, size)
        y = np.linspace(3, size+2, size)
    elif num == 1:
        x = np.linspace(5, size+4, size)
        y = np.linspace(2, size+1, size)
    elif num == 2:
        x = np.linspace(4, size+3, size)
        y = np.linspace(1, size, size)
    return x, y

Suppose we can compute some metric z_n from the given arrays x_n and y_n.

def get_single_z(x, y, constant=2):
    deltas = [x[i] - y[i] for i in range(len(x)) if len(x) == len(y)]
    return constant * np.array(deltas)

Instead of computing each z_n individually, we compute all of the z_n at once.

def get_all_z(constant=2):
    zs = []
    for num in range(3): ## 0, 1, 2
        xs, ys = get_xy(num)
        zs.append(get_single_z(xs, ys, constant))
    zs = np.array(zs)
    return zs

Relevant:

zs = get_all_z()
print(zs)
>> [[ 8.  8.  8.  8.  8.  8.  8.  8.  8.  8.]
    [ 6.  6.  6.  6.  6.  6.  6.  6.  6.  6.]
    [ 6.  6.  6.  6.  6.  6.  6.  6.  6.  6.]]

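For my purposes, I'd like to create a new list or array vs, in which the value at each index equals the mean of the values in the corresponding column of zs. In this case, every element of vs is the same (since each one is the mean of [8, 6, 6]). But if the first element of the first sub-array were 10 instead of 8, then the first element of vs would be the mean of [10, 6, 6].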

Failed attempt:

def get_avg_per_col(z):
    ## column ?= axis number
    return [np.mean(z, axis=i) for i in range(len(zs[0]))]

print(get_avg_per_col(zs))
Traceback (most recent call last):...
...line 50, in _count_reduce_items ## of numpy code, not my code
    items *= arr.shape[ax]
IndexError: tuple index out of range

Answer 1

You can use np.mean on the transposed zs to get the column-wise means.

In [49]: import numpy as np

In [53]: zs = np.array([[ 8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.],
    ...:  [ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.],
    ...:  [ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.]])

In [54]: np.mean(zs.T, axis=1)
Out[54]: 
array([ 6.66666667,  6.66666667,  6.66666667,  6.66666667,  6.66666667,
        6.66666667,  6.66666667,  6.66666667,  6.66666667,  6.66666667])
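
As a minimal sketch of the same idea: transposing and reducing along axis=1 should be equivalent to reducing the original array along axis=0, which gives one mean per column directly. The snippet also illustrates why the attempt in the question fails: `axis` names a dimension of the array, not a column index, and a 2-D array only has axes 0 and 1. The names `vs_transposed`, `zs_mod`, `vs_mod`, and `axis_error` are illustrative only and do not come from the question.

```python
import numpy as np

# Rebuild zs from the question: 3 rows (one per dataset), 10 columns.
zs = np.array([[8.0] * 10, [6.0] * 10, [6.0] * 10])

# axis=0 collapses the rows, leaving one mean per column.
vs = np.mean(zs, axis=0)

# The transposed form used in the answer computes the same values.
vs_transposed = np.mean(zs.T, axis=1)

# The variation described in the question: the first element of the
# first sub-array is 10 instead of 8, so only the first mean changes.
zs_mod = zs.copy()
zs_mod[0, 0] = 10.0
vs_mod = np.mean(zs_mod, axis=0)  # vs_mod[0] is the mean of [10, 6, 6]

# A 2-D array has no axis 2, so looping axis over range(10) breaks at i == 2.
# Older NumPy raises IndexError; newer versions raise AxisError, which
# subclasses both IndexError and ValueError.
axis_error = None
try:
    np.mean(zs, axis=2)
except (IndexError, ValueError) as exc:
    axis_error = exc

print(vs.shape)
print(vs_mod[0])
print(type(axis_error).__name__)
```

If a plain list is preferred over an array, `vs.tolist()` converts the result.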

How do I get the mean of each corresponding column of a nested numpy array?
