Question

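I'm new to Python and numpy, so I just run sample code and tweak it to understand it. I came across some numpy.sum code that uses the axis parameter, but I couldn't get it to run. After a while (reading the scipy docs, experimenting), I got it to run by using axis=(1,2,3) instead of axis=1.

The thing is, everywhere I search, people only write axis=1 to make it work.

I'm using Python 3.5.3 and numpy 1.12.1. Is there a numpy/Python version where the behavior differs significantly? Or did I just misconfigure something?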

import numpy as np
from past.builtins import xrange


# sample data
X = np.arange(1, 4*4*3*5+1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8+5).reshape(8, 4, 4, 3)
Xlen = X.shape[0]
Ylen = Y.shape[0]

# allocate some space for whatever calculation
rs = np.zeros((Xlen, Ylen))
rs1 = np.zeros((Xlen, Ylen))

# calculate the result with 2 loops
for i in xrange(Xlen):
    for j in xrange(Ylen):
        rs[i, j] = np.sum(X[i] + Y[j])

# calculate the result with one loop only
for i in xrange(Xlen):
    rs1[i, :] = np.sum(Y + X[i], axis=(1,2,3))

print(rs1 == rs) # same result

# also with one loop, as everywhere on the internet:
for i in xrange(Xlen):
    rs1[i, :] = np.sum(Y + X[i], axis=1)
    # ValueError: could not broadcast input array from shape (8,4,3) into shape (8)

Answer 1

axis : None or int or tuple of ints, optional
    ...
    If axis is a tuple of ints, a sum is performed on all of the axes
    specified in the tuple instead of a single axis or all the axes as
    before.

The ability to pass a tuple is an addition (v1.7, 2013). I haven't used it much; when I needed this in MATLAB, I used repeated sums, e.g.

In [149]: arr = np.arange(24).reshape(2,3,4)
In [150]: arr.sum(axis=(1,2))
Out[150]: array([ 66, 210])
In [151]: arr.sum(axis=2).sum(axis=1)
Out[151]: array([ 66, 210])

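When doing sequential sums, you have to remember that the number of axes changes after each one (unless you use keepdims, which is itself a newer parameter).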
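As a small sketch of that point, using the same toy array as above (the variable names are only illustrative), the axis index for the second sum depends on whether the first sum dropped its axis:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Without keepdims the summed axis disappears, so the old axis 2 becomes axis 1.
step = arr.sum(axis=1)            # shape (2, 4)
chained = step.sum(axis=1)        # shape (2,)

# With keepdims=True the summed axis stays with length 1, so axis numbers do not shift.
kept = arr.sum(axis=1, keepdims=True)             # shape (2, 1, 4)
chained_kept = kept.sum(axis=2, keepdims=True)    # shape (2, 1, 1)

print(step.shape, chained)
print(kept.shape, chained_kept.ravel())
```

Both routes give the same two totals as `arr.sum(axis=(1,2))`; only the bookkeeping of axis numbers differs.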

Your X, Y sum:

In [160]: rs = np.zeros((Xlen, Ylen),int)
     ...: rs1 = np.zeros((Xlen, Ylen),int)
     ...: 
     ...: # calculate the result with 2 loops
     ...: for i in range(Xlen):
     ...:   for j in range(Ylen):
     ...:     rs[i,j] = np.sum(X[i] + Y[j])
     ...: 
In [161]: rs
Out[161]: 
array([[ 2544,  4848,  7152,  9456, 11760, 14064, 16368, 18672],
       [ 4848,  7152,  9456, 11760, 14064, 16368, 18672, 20976],
       [ 7152,  9456, 11760, 14064, 16368, 18672, 20976, 23280],
       [ 9456, 11760, 14064, 16368, 18672, 20976, 23280, 25584],
       [11760, 14064, 16368, 18672, 20976, 23280, 25584, 27888]])

This can be reproduced without any loops:

In [162]: X.sum((1,2,3))
Out[162]: array([ 1176,  3480,  5784,  8088, 10392])
In [163]: Y.sum((1,2,3))
Out[163]: array([ 1368,  3672,  5976,  8280, 10584, 12888, 15192, 17496])
In [164]: X.sum((1,2,3))[:,None] + Y.sum((1,2,3))
Out[164]: 
array([[ 2544,  4848,  7152,  9456, 11760, 14064, 16368, 18672],
       [ 4848,  7152,  9456, 11760, 14064, 16368, 18672, 20976],
       [ 7152,  9456, 11760, 14064, 16368, 18672, 20976, 23280],
       [ 9456, 11760, 14064, 16368, 18672, 20976, 23280, 25584],
       [11760, 14064, 16368, 18672, 20976, 23280, 25584, 27888]])

np.sum(X[i] + Y[j]) => np.sum(X[i]) + np.sum(Y[j]). sum(X[i]) sums all elements of X[i] (axis=None). That is the same as summing over every axis except the first and then indexing, X.sum(axis=(1,2,3))[i].

In [165]: X[0].sum()
Out[165]: 1176
In [166]: X.sum((1,2,3))[0]
Out[166]: 1176
In [167]: X.sum(1).sum(1).sum(1)[0]
Out[167]: 1176

As for the broadcasting error, look at these pieces:

In [168]: rs1[i,:]
Out[168]: array([0, 0, 0, 0, 0, 0, 0, 0])   # shape (8,)
In [169]: (Y+X[i]).shape    # (8,4,4,3) + (4,4,3)
Out[169]: (8, 4, 4, 3)
In [170]: (Y+X[i]).sum(1).shape    # sums axis 1, ie one of the 4's
Out[170]: (8, 4, 3)
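To make the mismatch concrete, here is a sketch that rebuilds the question's `X` and `Y` and compares the shapes produced by `axis=1` and `axis=(1,2,3)`. Only the second matches the `(8,)` row of `rs1`. The `try`/`except` and the `failed` flag are additions for the example, there only to show the error without stopping the script.

```python
import numpy as np

X = np.arange(1, 4*4*3*5 + 1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8 + 5).reshape(8, 4, 4, 3)
rs1 = np.zeros((X.shape[0], Y.shape[0]), int)

i = 0
partial = (Y + X[i]).sum(axis=1)          # only one of the 4's is summed
full = (Y + X[i]).sum(axis=(1, 2, 3))     # every axis except axis 0 is summed
print(partial.shape, full.shape)

rs1[i, :] = full                          # fits: (8,) into (8,)

try:
    rs1[i, :] = partial                   # (8, 4, 3) cannot go into (8,)
    failed = False
except ValueError as err:
    failed = True
    print(err)
```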

Answer 2

To get the same result using only axis=1, we can reshape the data sets beforehand so that each sample is a single flat row.

X = np.reshape(X, (X.shape[0], -1))
Y = np.reshape(Y, (Y.shape[0], -1))

for i in xrange(Xlen):
    rs[i, :] = np.sum(Y + X[i], axis=1)
print(rs)

Result:

[[  2544.   4848.   7152.   9456.  11760.  14064.  16368.  18672.]
 [  4848.   7152.   9456.  11760.  14064.  16368.  18672.  20976.]
 [  7152.   9456.  11760.  14064.  16368.  18672.  20976.  23280.]
 [  9456.  11760.  14064.  16368.  18672.  20976.  23280.  25584.]
 [ 11760.  14064.  16368.  18672.  20976.  23280.  25584.  27888.]]
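Once both arrays are flattened to 2-D, the remaining loop can also be dropped with broadcasting. This is a sketch rather than part of the answer above; `Xf`, `Yf`, `rs_flat` and `rs_tuple` are names made up for the example.

```python
import numpy as np

X = np.arange(1, 4*4*3*5 + 1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8 + 5).reshape(8, 4, 4, 3)

Xf = X.reshape(X.shape[0], -1)    # (5, 48)
Yf = Y.reshape(Y.shape[0], -1)    # (8, 48)

# (5, 1, 48) + (1, 8, 48) -> (5, 8, 48), then sum over the flattened axis.
rs_flat = (Xf[:, None, :] + Yf[None, :, :]).sum(axis=2)

# Same values from the tuple-axis form on the original 4-D arrays.
rs_tuple = X.sum(axis=(1, 2, 3))[:, None] + Y.sum(axis=(1, 2, 3))
print(np.array_equal(rs_flat, rs_tuple))
```

The broadcast form builds a (5, 8, 48) temporary, so for large inputs summing each array first and then adding is likely the cheaper choice.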

Python numpy sum function with axis behavior
