Question

Consider a NumPy ndarray A of floats with n dimensions and arbitrary shape D=[d1,...,dn] (the di are non-negative integers). How can I fill A so that, for example:

A[j1,...,jn]=sqrt(j1*...*jn)

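where 0<=ji<di. If I knew n and it were fixed, I could simply nest n for loops to fill the ndarray. In my program, however, n is not fixed, and nested loops would be inefficient anyway. I would like to know whether there is a way to

fill an ndarray A of a given arbitrary shape D (e.g., with the formula above, or any other function of the indices),
preferably avoiding Python for loops and making use of NumPy's built-in functionality.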

Thanks for your help.

Answer 1

An important thing to realize is that you can use broadcasting to solve this problem efficiently. For the 2D case, you can do

import numpy

d1, d2 = (3, 4)
A = numpy.sqrt(numpy.arange(d1)[:, None] * numpy.arange(d2)[None, :])
# array([[0.        , 0.        , 0.        , 0.        ],
#        [0.        , 1.        , 1.41421356, 1.73205081],
#        [0.        , 1.41421356, 2.        , 2.44948974]])

Once you are comfortable using broadcasting to build these outer products (or sums, or comparisons, etc.), we can try to solve the nD case.
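As a small illustration of the parenthetical above, the same column-times-row trick works for other elementwise operations. This is only a sketch; the names rows, cols, outer_sum and outer_less are made up for this example.

```python
import numpy as np

d1, d2 = 3, 4
rows = np.arange(d1)[:, None]   # shape (3, 1): one index value per row
cols = np.arange(d2)[None, :]   # shape (1, 4): one index value per column

# Broadcasting stretches both operands to the common shape (3, 4).
outer_sum = rows + cols         # entry [i, j] is i + j
outer_less = rows < cols        # entry [i, j] is True where i < j
```

Only the operator changes; the reshaping of the index vectors stays the same.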

Looking at the input arrays in the code above, we see that they have the shapes

(d1,  1)
( 1, d2)

So to do this in nD, we need a way to take the 1D index arrays and automatically turn them into arrays of shape

(d1,  1,  1, ...)
( 1, d2,  1, ...)
( 1,  1, d3, ...)

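NumPy provides such a function: numpy.meshgrid(..., sparse=True). Pass indexing='ij' so that the k-th output varies along axis k; the default 'xy' indexing swaps the first two axes.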

numpy.meshgrid(numpy.arange(3), numpy.arange(4), sparse=True, indexing='ij')
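To see that the sparse grids really have the shapes listed above, one can inspect them directly. A minimal sketch for three axes, using indexing='ij' so that the k-th array varies along axis k (grids and shapes are just illustrative names):

```python
import numpy as np

grids = np.meshgrid(np.arange(3), np.arange(4), np.arange(5),
                    sparse=True, indexing='ij')

# Each grid has length d_k along its own axis and length 1 everywhere else.
shapes = [g.shape for g in grids]
print(shapes)  # [(3, 1, 1), (1, 4, 1), (1, 1, 5)]
```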

Knowing this, we can put it all together in one line

D = (3, 4, 5)
numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# array([[[0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 0.        , 0.        , 0.        , 0.        ]],
# 
#        [[0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 1.        , 1.41421356, 1.73205081, 2.        ],
#         [0.        , 1.41421356, 2.        , 2.44948974, 2.82842712],
#         [0.        , 1.73205081, 2.44948974, 3.        , 3.46410162]],
# 
#        [[0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 1.41421356, 2.        , 2.44948974, 2.82842712],
#         [0.        , 2.        , 2.82842712, 3.46410162, 4.        ],
#         [0.        , 2.44948974, 3.46410162, 4.24264069, 4.89897949]]])
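One caveat, stated as an assumption about recent NumPy releases rather than something from the original answer: passing a list of differently shaped arrays to numpy.prod relies on an implicit conversion of that list to an array, which newer versions may reject. A sketch that avoids the conversion multiplies the sparse grids pairwise with functools.reduce; sqrt_index_product is a hypothetical helper name.

```python
from functools import reduce

import numpy as np


def sqrt_index_product(shape):
    """Return A with A[j1, ..., jn] = sqrt(j1 * ... * jn) for a non-empty shape."""
    grids = np.meshgrid(*[np.arange(d) for d in shape],
                        sparse=True, indexing='ij')
    # Multiply the grids one pair at a time; broadcasting expands the
    # running product to the full shape without building a ragged array.
    product = reduce(np.multiply, grids)
    return np.sqrt(product)


A = sqrt_index_product((3, 4, 5))
```

The result has the requested shape, for example A.shape is (3, 4, 5) here, and the same function works for any number of axes.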

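Performance evaluation

To evaluate the performance of all three solutions (myfunc2 and creation_function are defined in Answer 2), let's time them for several different tensor sizes: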

D=(2,3,4,5)

%timeit np.fromfunction(function=myfunc2, shape=D)
# 501 µs ± 9.34 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit np.fromfunction(function=creation_function, shape=D)
# 24.2 µs ± 455 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# 30.9 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

D=(20,30,40,50)

%timeit np.fromfunction(function=myfunc2, shape=D)
# 4.64 s ± 36.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit np.fromfunction(function=creation_function, shape=D)
# 36.7 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# 9 ms ± 237 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

D=(200,30,40,50)

%timeit np.fromfunction(function=myfunc2, shape=D)
# never completed
%timeit np.fromfunction(function=creation_function, shape=D)
# 508 ms ± 7.41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# 88.1 ms ± 1.63 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

D=(200,300,40,50)

%timeit np.fromfunction(function=myfunc2, shape=D)
# never completed
%timeit np.fromfunction(function=creation_function, shape=D)
# 5.8 s ± 565 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# 1.29 s ± 15.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
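The %timeit lines above need IPython. Outside of it, a rough equivalent can be put together with the standard timeit module. This is only a sketch: the dictionary of candidates and the name broadcast_version are made up here, the broadcasting variant uses functools.reduce instead of numpy.prod, and the numbers it produces depend on the machine.

```python
import timeit
from functools import reduce

import numpy as np


def myfunc(*J):
    return np.sqrt(np.prod(np.array(J)))


myfunc2 = np.vectorize(myfunc)
creation_function = lambda *args: np.sqrt(np.prod(np.array([*args]), axis=0))


def broadcast_version(shape):
    grids = np.meshgrid(*[np.arange(d) for d in shape],
                        sparse=True, indexing='ij')
    return np.sqrt(reduce(np.multiply, grids))


D = (2, 3, 4, 5)
candidates = {
    "vectorize": lambda: np.fromfunction(function=myfunc2, shape=D),
    "fromfunction": lambda: np.fromfunction(function=creation_function, shape=D),
    "broadcast": lambda: broadcast_version(D),
}

# Average seconds per call over a small, fixed number of runs.
runs = 20
timings = {name: timeit.timeit(fn, number=runs) / runs
           for name, fn in candidates.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} µs per call")
```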

Answer 2

A modified version of ibarrond's solution that does not use a lambda and works for more than 3 dimensions:

import numpy as np

def myfunc(*J):
    return np.sqrt(np.prod(np.array(J)))
myfunc2=np.vectorize(myfunc)

D=(2,3,4,5)

np.fromfunction(function=myfunc2 , shape=D)

P.S. (Unfortunately) he deleted his answer, so I am copying it here for reference:

creation_function = lambda *args: np.sqrt(np.prod(np.array([*args]), axis=0))
np.fromfunction(function=creation_function, shape=D)
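As a sanity check, the variants discussed in this thread can be compared on a small shape. The sketch below repeats the definitions so that it runs on its own; the broadcasting variant uses functools.reduce as a stand-in for the meshgrid one-liner from Answer 1, and the names a, b, c and all_agree are illustrative.

```python
from functools import reduce

import numpy as np


def myfunc(*J):
    return np.sqrt(np.prod(np.array(J)))


myfunc2 = np.vectorize(myfunc)
creation_function = lambda *args: np.sqrt(np.prod(np.array([*args]), axis=0))

D = (2, 3, 4, 5)

a = np.fromfunction(function=myfunc2, shape=D)            # elementwise via np.vectorize
b = np.fromfunction(function=creation_function, shape=D)  # stacked dense index arrays

grids = np.meshgrid(*[np.arange(d) for d in D], sparse=True, indexing='ij')
c = np.sqrt(reduce(np.multiply, grids))                    # broadcasting over sparse grids

all_agree = np.allclose(a, b) and np.allclose(b, c)
print(all_agree)
```

All three arrays have shape D; they differ only in how much intermediate work and memory they need.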

Populate numpy ndarray of arbitrary shape (preferably without for loops)
