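[The earlier version of this post got absolutely no responses, so, in case this was due to a lack of clarity, I have reworked it, with additional explanations and code comments.]
I want to compute the mean and standard deviation of the elements of a numpy n-dimensional array over not a single axis but k > 1 non-consecutive axes, and collect the results in a new (n - k + 1)-dimensional array.
Does numpy include standard constructs for performing this operation efficiently?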
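The function mu_sigma, reproduced below, is my best attempt at solving this problem, but it has two glaring inefficiencies: 1) it requires copying the original data; 2) it computes the mean twice (since computing the standard deviation requires computing the mean).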
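The function mu_sigma takes two arguments: box and axes. box is an n-dimensional numpy array (aka "ndarray"), and axes is a k-tuple of integers representing k of box's dimensions (not necessarily consecutive ones). The function returns a new (n - k + 1)-dimensional ndarray containing the mean and the standard deviation computed over the "hyperslabs" of box spanned by the k specified axes.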
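To make the shape contract above concrete, here is a minimal sketch. The helper expected_outshape is a hypothetical name introduced only for illustration; it is not part of mu_sigma, and the input shape is an arbitrary choice.

```python
def expected_outshape(inshape, axes):
    """Shape of the result described above: the non-axes dimensions,
    in their original order, followed by one dimension of length 2
    (index 0 for the mean, index 1 for the standard deviation).

    Hypothetical helper, for illustration only.
    """
    axes = set(axes)
    nonaxes = [size for i, size in enumerate(inshape) if i not in axes]
    return tuple(nonaxes) + (2,)


# n == 4 and k == 2, so the result has n - k + 1 == 3 dimensions
shape = expected_outshape((5, 2, 6, 3), (1, 3))
print(shape)  # (5, 6, 2)
```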
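The code below also includes an example of mu_sigma in action. In this example, the box argument is a 4 x 2 x 4 x 3 x 4 ndarray of floats, and the axes argument is the tuple (1, 3). (Hence, we have n == len(box.shape) == 5 and k == len(axes) == 2.) The result (which here I'll call outbox) returned for this example input is a 4 x 4 x 4 x 2 ndarray of floats. For each triple of indices i, j, k (where each index ranges over the set {0, 1, 2, 3}, and k now denotes an index rather than the number of axes), the element outbox[i, j, k, 0] is the mean of the 6 elements specified by the numpy expression box[i, 0:2, j, 0:3, k]. Similarly, outbox[i, j, k, 1] is the standard deviation of the same 6 elements. This means that the first n - k == 3 dimensions of the result range over the same indices as the n - k non-axes dimensions of the input ndarray box, which in this case are dimensions 0, 2, and 4.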
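As a small sketch of what a single pair of output elements should contain, the snippet below builds a 4 x 2 x 4 x 3 x 4 array and extracts one hyperslab directly. The array contents and the index values are arbitrary choices made for illustration.

```python
import numpy as np

# A 4 x 2 x 4 x 3 x 4 array holding 0.0, 1.0, 2.0, ... in C order.
box = np.arange(4 * 2 * 4 * 3 * 4, dtype=float).reshape(4, 2, 4, 3, 4)

# Arbitrary values for the three non-axes indices (dimensions 0, 2 and 4).
i, j, k = 1, 2, 3

# Fixing i, j and k leaves a 2 x 3 hyperslab spanning dimensions 1 and 3.
slab = box[i, 0:2, j, 0:3, k]
print(slab.shape)  # (2, 3)

mu = slab.mean()    # the value expected at outbox[i, j, k, 0]
sigma = slab.std()  # the value expected at outbox[i, j, k, 1]
```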
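The strategy used in mu_sigma is to
1) permute the dimensions (using the transpose method) so that the axes specified in the function's second argument are all put at the end; the remaining (non-axes) dimensions are kept at the beginning (in their original order);
2) collapse the axes-dimensions into one (using the reshape method); the new "collapsed" dimension is now the last dimension of the reshaped ndarray;
3) compute an ndarray of means over the last, "collapsed" dimension;
4) compute an ndarray of standard deviations over the same "collapsed" dimension;
5) return an ndarray obtained by concatenating the ndarrays produced in (3) and (4).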
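The permute-and-collapse steps can be traced on a small array, as in the sketch below. The zero-filled input is an arbitrary placeholder, and np.shares_memory is used only to illustrate where the copy happens.

```python
import numpy as np

box = np.zeros((4, 2, 4, 3, 4))
axes = (1, 3)
nonaxes = tuple(i for i in range(box.ndim) if i not in axes)  # (0, 2, 4)

# Permute: move the axes-dimensions to the end. transpose returns a view.
permuted = box.transpose(nonaxes + axes)
print(permuted.shape)                   # (4, 4, 4, 2, 3)
print(np.shares_memory(box, permuted))  # True

# Collapse: merge the trailing axes-dimensions into a single dimension.
reshaped = permuted.reshape(permuted.shape[:len(nonaxes)] + (-1,))
print(reshaped.shape)                   # (4, 4, 4, 6)
print(np.shares_memory(box, reshaped))  # False: the data was copied
```

The copy arises because the two trailing dimensions of the permuted view are not laid out in memory in a way that a single stride can describe, so reshape cannot return a view here.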
import numpy as np

def mu_sigma(box, axes):
    inshape = box.shape
    axes = tuple(axes)
    # determine the permutation needed to put all the dimensions given in axes
    # at the end (otherwise preserving the relative ordering of the dimensions)
    nonaxes = tuple(i for i in range(len(inshape)) if i not in axes)
    # permute the dimensions
    permuted = box.transpose(nonaxes + axes)
    # determine the shape of the ndarray after permuting the dimensions and
    # collapsing the axes-dimensions; thanks to Bago for the "+ (-1,)"
    newshape = tuple(inshape[i] for i in nonaxes) + (-1,)
    # collapse the axes-dimensions
    # NB: the next line results in copying the input array
    reshaped = permuted.reshape(newshape)
    # determine the shape for the mean and std ndarrays, as required by
    # the subsequent call to np.concatenate (this reshaping is not necessary
    # if the available mean and std methods support the keepdims keyword;
    # instead, just set keepdims to True in both calls).
    outshape = newshape[:-1] + (1,)
    # compute the means and standard deviations
    mean = reshaped.mean(axis=-1).reshape(outshape)
    std = reshaped.std(axis=-1).reshape(outshape)
    # collect the results in a single ndarray, and return it
    return np.concatenate((mean, std), axis=-1)

inshape = 4, 2, 4, 3, 4
# fill the input buffer with the floats 0.0, 1.0, 2.0, ...
inbuf = np.arange(np.prod(inshape), dtype=float)
inbox = np.ndarray(inshape, buffer=inbuf)
outbox = mu_sigma(inbox, tuple(range(len(inshape))[1::2]))

# "inline tests"
# every hyperslab has the same standard deviation
assert all(outbox[..., 1].ravel() ==
           [inbox[0, :, 0, :, 0].std()] * outbox[..., 1].size)
# the means follow a closed-form pattern in the three non-axes indices
assert all(outbox[..., 0].ravel() == [float(4*(v + 3*w) + x)
                                      for v in [8*y - 1
                                                for y in [3*z + 1
                                                          for z in range(4)]]
                                      for w in range(4)
                                      for x in range(4)])
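
One possibility worth checking, assuming a NumPy version in which mean and std accept a tuple of ints for axis (the NumPy documentation lists this as available from version 1.7) and in which np.stack is available, is to pass all k axes in a single call. The function name mu_sigma_tuple_axis below is hypothetical, and this is a sketch rather than a verified replacement for mu_sigma.

```python
import numpy as np


def mu_sigma_tuple_axis(box, axes):
    """Hypothetical variant of mu_sigma that passes all axes at once."""
    axes = tuple(axes)
    # Reduce over every axis in axes in a single call; no explicit
    # transpose or reshape is written out here.
    mean = box.mean(axis=axes)
    std = box.std(axis=axes)
    # Stack along a new trailing axis: index 0 is the mean, 1 the std.
    return np.stack((mean, std), axis=-1)


box = np.arange(4 * 2 * 4 * 3 * 4, dtype=float).reshape(4, 2, 4, 3, 4)
result = mu_sigma_tuple_axis(box, (1, 3))
print(result.shape)  # (4, 4, 4, 2)
```

This sketch does not address the second inefficiency, since std still computes a mean internally, and whether it avoids intermediate copies depends on NumPy internals that are not verified here.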