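Given a numpy array of the following form:
x = [[4.,3.,2.,1.,8.],[1.2,3.1,0.,9.2,5.5],[0.2,7.0,4.4,0.2,1.3]]
Is there a way to keep the top 3 values in each row and set all other values to zero in Python (without an explicit loop)? For the example above, the result would be
x = [[4.,3.,0.,0.,8.],[0.,3.1,0.,9.2,5.5],[0.0,7.0,4.4,0.0,1.3]]
Sample code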
import numpy as np
arr = np.array([1.2, 3.1, 0., 9.2, 5.5, 3.2])
indexes = arr.argsort()[-3:][::-1]  # indices of the 3 largest values
a = list(range(6))
A = set(indexes)
B = set(a)
zero_ind = B.difference(A)  # every index that is not in the top 3
arr[list(zero_ind)] = 0
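Output:
array([0. , 0. , 0. , 9.2, 5.5, 3.2])
Above is my sample code (many lines of it) for a one-dimensional numpy array. Looping over every row of a numpy array and repeating the same computation would be quite expensive. Is there a simpler way?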
Answer 0 (score: 0)
Here is an approach that uses a list comprehension to go through the rows and apply a keep_top_3 function to each one
import heapq
import numpy as np

def keep_top_3(arr):
    smallest = heapq.nlargest(3, arr)[-1]  # find the top 3 and use the smallest as cut off
    arr[arr < smallest] = 0  # replace anything lower than the cut off with 0
    return arr

x = [[4.,3.,2.,1.,8.],[1.2,3.1,0.,9.2,5.5],[0.2,7.0,4.4,0.2,1.3]]
result = [keep_top_3(np.array(arr)) for arr in x]
I hope this helps :)
Answer 1 (score: 0)
Use np.apply_along_axis, which applies a function to 1-D slices along the given axis.
import numpy as np

def top_k_values(array):
    out = array.copy()  # work on a copy so the input slice is not modified
    indexes = out.argsort()[-3:]  # indices of the 3 largest values
    A = set(indexes)
    B = set(range(out.shape[0]))
    out[list(B.difference(A))] = 0  # zero every position outside the top 3
    return out

arr = np.array([[4.,3.,2.,1.,8.],[1.2,3.1,0.,9.2,5.5],[0.2,7.0,4.4,0.2,1.3]])
result = np.apply_along_axis(top_k_values, 1, arr)
print(result)
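Note that np.apply_along_axis is a convenience wrapper: it still calls the function once per row from Python. As a minimal sketch of the same argsort idea applied to the whole 2-D array at once (the helper name keep_top_k_argsort is made up for illustration, and k is assumed to be at least 1 and at most the row length):

```python
import numpy as np

def keep_top_k_argsort(arr, k=3):
    # Hypothetical helper, not part of the answer above.
    order = np.argsort(arr, axis=1)                    # ascending order within each row
    out = arr.copy()                                   # leave the input untouched
    np.put_along_axis(out, order[:, :-k], 0, axis=1)   # zero all but the k largest per row
    return out

x = np.array([[4., 3., 2., 1., 8.],
              [1.2, 3.1, 0., 9.2, 5.5],
              [0.2, 7.0, 4.4, 0.2, 1.3]])
result = keep_top_k_argsort(x, k=3)
print(result)
```

This sorts every row fully, so it does more work than strictly needed, but it avoids the per-row Python call.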
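Answer 2 (score: 0)
Here is a fully vectorized version with no third-party dependency other than numpy. It uses numpy's argpartition to find the k-th largest value of each row efficiently. For other use cases, see e.g. this answer.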
import numpy

def truncate_top_k(x, k, inplace=False):
    m, n = x.shape
    # get (unsorted) indices of the top-k values in each row
    topk_indices = numpy.argpartition(x, -k, axis=1)[:, -k:]
    # get the k-th largest value of each row
    rows, _ = numpy.indices((m, k))
    kth_vals = x[rows, topk_indices].min(axis=1)
    # get boolean mask of values smaller than the k-th largest
    is_smaller_than_kth = x < kth_vals[:, None]
    # set the masked entries to 0
    if not inplace:
        return numpy.where(is_smaller_than_kth, 0, x)
    x[is_smaller_than_kth] = 0
    return x
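If only the threshold is needed, and not the indices, the k-th largest value can also be read straight from np.partition. The following is a shorter sketch of the same threshold idea, run on the array from the question; the function name truncate_top_k_partition is hypothetical, and a 2-D input with 1 <= k <= number of columns is assumed:

```python
import numpy as np

def truncate_top_k_partition(x, k):
    # Hypothetical variant: column -k of the partitioned array holds
    # the k-th largest value of each row.
    kth_vals = np.partition(x, -k, axis=1)[:, -k]
    # Zero everything strictly below that threshold; returns a new array.
    return np.where(x < kth_vals[:, None], 0, x)

x = np.array([[4., 3., 2., 1., 8.],
              [1.2, 3.1, 0., 9.2, 5.5],
              [0.2, 7.0, 4.4, 0.2, 1.3]])
result = truncate_top_k_partition(x, 3)
print(result)
```

Like the version above, this keeps every value that ties with the k-th largest, so a row can end up with more than k non-zero entries.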
Answer 3 (score: 0)
import numpy as np

def top_k(arr, k, axis=0):
    n = arr.shape[axis]
    # indices of the top k values along axis (last k slots of the partition)
    top_k_idx = np.take(np.argpartition(arr, -k, axis=axis),
                        np.arange(n - k, n),
                        axis=axis)
    out = np.zeros_like(arr)  # create zero array
    np.put_along_axis(out, top_k_idx,  # copy the top k values of arr into out
                      np.take_along_axis(arr, top_k_idx, axis=axis),
                      axis=axis)
    return out
This should work for arbitrary axis and k, but it does not work in place. If you want it in place, it is even simpler:
def top_k(arr, k, axis=0):
    # indices to remove: the first n - k slots of the partition along axis
    remove_idx = np.take(np.argpartition(arr, -k, axis=axis),
                         np.arange(arr.shape[axis] - k),
                         axis=axis)
    np.put_along_axis(arr, remove_idx, 0, axis=axis)  # put 0 at those indices, modifying arr in place
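One practical difference between the approaches in this thread is how they treat ties. Threshold-based versions (Answers 0 and 2) keep every value equal to the k-th largest, while index-based versions like the one above keep exactly k entries. A minimal sketch, using a made-up row with repeated values that is not from the question:

```python
import numpy as np

row = np.array([[5., 5., 5., 5., 1.]])  # illustrative input with ties
k = 3

# Threshold-based: zero everything strictly below the k-th largest value.
kth = np.partition(row, -k, axis=1)[:, -k]
by_threshold = np.where(row < kth[:, None], 0, row)

# Index-based: zero exactly the n - k positions outside the top k.
drop = np.argpartition(row, -k, axis=1)[:, :-k]
by_index = row.copy()
np.put_along_axis(by_index, drop, 0, axis=1)

n_threshold = np.count_nonzero(by_threshold)  # all four 5s survive
n_index = np.count_nonzero(by_index)          # exactly k values survive
print(n_threshold, n_index)
```

Which of the tied 5s the index-based version drops is not specified by argpartition, so only the count should be relied on.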