I have a Python NumPy array. It is a 2D array whose second dimension is a subarray of 3 integer elements. For example:
[ [2, 3, 4], [9, 8, 7], ... [15, 14, 16] ]
For each subarray, I want to replace the smallest number with 1 and all other numbers with 0. So the desired output for the example above is:
[ [1, 0, 0], [0, 0, 1], ... [0, 1, 0] ]
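This is a large array, so I want to take advantage of NumPy's performance. I know how to manipulate array elements using a condition, but what do I do when the condition is dynamic? In this case the condition would have to be something like: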
newarray = (a == min(a)).astype(int)
But how do I do this within each subarray?
Answer 0 (score: 1)
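You can pass the axis argument to min to compute the per-row minimums as a 2D array (provided you keep the dimensions of the result with keepdims=True). Comparing a against that array of row minimums then gives True at the position of the minimum in each subarray: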
(a == a.min(1, keepdims=True)).astype(int)
# array([[1, 0, 0],
#        [0, 0, 1],
#        [0, 1, 0]])
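To make the role of keepdims concrete, here is a minimal sketch using the three rows shown in the question. The names row_min and mask are only illustrative. With keepdims=True the minimums have shape (3, 1), so they broadcast against the (3, 3) input one row at a time.

```python
import numpy as np

# The three example rows from the question.
a = np.array([[2, 3, 4], [9, 8, 7], [15, 14, 16]])

# keepdims=True keeps the reduced axis as length 1, giving shape (3, 1).
row_min = a.min(axis=1, keepdims=True)

# Broadcasting compares every element with the minimum of its own row.
mask = (a == row_min).astype(int)

print(row_min.shape)
print(mask)
```

Without keepdims the minimums would have shape (3,), and broadcasting would line them up with the columns instead of the rows.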
Answer 1 (score: 1)
How about this?
import numpy as np
a = np.random.random((4,3))
i = np.argmin(a, axis=-1)
out = np.zeros(a.shape, int)
out[np.arange(out.shape[0]), i] = 1
print(a)
print(out)
Sample output:
# [[ 0.58321885 0.18757452 0.92700724]
# [ 0.58082897 0.12929637 0.96686648]
# [ 0.26037634 0.55997658 0.29486454]
# [ 0.60398426 0.72253012 0.22812904]]
# [[0 1 0]
# [0 1 0]
# [1 0 0]
# [0 0 1]]
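One difference between the two approaches is how they treat ties. The sketch below uses a made-up array whose first row has two equal minimums (the names dense_out and sparse_out are illustrative): the comparison approach marks every element equal to the row minimum, while np.argmin returns only the first occurrence.

```python
import numpy as np

# Hypothetical data: the first row contains a tie for the minimum.
a = np.array([[5, 1, 1], [2, 3, 4]])

# Comparison approach: every element equal to the row minimum becomes 1.
dense_out = (a == a.min(axis=1, keepdims=True)).astype(int)

# argmin approach: only the first minimum in each row becomes 1.
sparse_out = np.zeros(a.shape, dtype=int)
sparse_out[np.arange(a.shape[0]), a.argmin(axis=1)] = 1

print(dense_out)
print(sparse_out)
```

If exactly one 1 per row is required, the argmin version guarantees it. If every tied minimum should be marked, the comparison version does that.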
It seems to be a bit faster than the direct approach:
from timeit import timeit

def dense():
    return (a == a.min(1, keepdims=True)).astype(int)

def sparse():
    i = np.argmin(a, axis=-1)
    out = np.zeros(a.shape, int)
    out[np.arange(out.shape[0]), i] = 1
    return out

for shp in ((4,3), (10000,3), (100,10), (100000,1000)):
    a = np.random.random(shp)
    d = timeit(dense, number=40)/40
    s = timeit(sparse, number=40)/40
    print('shape, dense, sparse, ratio', '({:6d},{:6d}) {:9.6g} {:9.6g} {:9.6g}'.format(*shp, d, s, d/s))
Sample run:
# shape, dense, sparse, ratio ( 4, 3) 4.22172e-06 3.1274e-06 1.34992
# shape, dense, sparse, ratio ( 10000, 3) 0.000332396 0.000245348 1.35479
# shape, dense, sparse, ratio ( 100, 10) 9.8944e-06 5.63165e-06 1.75693
# shape, dense, sparse, ratio (100000, 1000) 0.344177 0.189913 1.81229