I have a large 1D np.array (vec). I would like to replace each value in vec with the closest value from a shorter array, vals.
I tried the following:
replaced_vals=vals[np.argmin(np.abs(vec[:, np.newaxis] - vals), axis=0)]
But it does not work because vec and vals have different sizes.
Example input:
vec = np.array([10.1,10.7,11.4,102,1100])
vals = np.array([10.0,11.0,100.0])
Desired output:
replaced_vals = [10.0,11.0,11.0,100.0,100.0]
Answer 0 (score: 2)
You have to take the argmin along the other axis to get the desired values, like this:
replaced_vals=vals[np.argmin(np.abs(vec[:, np.newaxis] - vals), axis=1)]
Output for the input in your question:
array([ 10., 11., 11., 100., 100.])
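To see why axis=1 is the right choice, it can help to look at the shape of the intermediate array. The following is only a small sketch using the example input from the question; the names diff and idx are introduced here for illustration and do not appear in the answer.

```python
import numpy as np

vec = np.array([10.1, 10.7, 11.4, 102, 1100])
vals = np.array([10.0, 11.0, 100.0])

# Broadcasting a (5, 1) column against a (3,) row gives a (5, 3) table:
# row k holds the distances from vec[k] to every entry of vals.
diff = np.abs(vec[:, np.newaxis] - vals)
print(diff.shape)  # (5, 3)

# axis=1 reduces over the vals dimension: one index into vals per element of vec.
idx = np.argmin(diff, axis=1)
print(idx)  # [0 1 1 2 2]

# axis=0 reduces over the vec dimension instead: one index into vec per element
# of vals, which is not what the question asks for.
print(np.argmin(diff, axis=0).shape)  # (3,)

replaced_vals = vals[idx]
```

Note that diff holds len(vec) * len(vals) floats, which is where the memory cost of this approach comes from.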
Answer 1 (score: 2)
If your vals array is sorted, np.searchsorted gives a more memory-efficient, and probably generally faster, solution:
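import numpy as np

def jpp(vec, vals):
    # vals must be sorted in ascending order
    ss = np.searchsorted(vals, vec)
    a = vals[ss - 1]                           # left neighbour
    b = vals[np.minimum(len(vals) - 1, ss)]    # right neighbour, clamped to the last index
    return np.where(np.fabs(vec - a) < np.fabs(vec - b), a, b)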
vec = np.array([10.1,10.7,11.4,102,1100])
vals = np.array([10.0,11.0,100.0])
print(jpp(vec, vals))
[ 10. 11. 11. 100. 100.]
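The intermediate arrays make the logic of jpp easier to follow. The sketch below unrolls the same statements on the example input; the extra value 9.0 is an assumption added here to show the lower edge case, where ss is 0 and vals[ss - 1] wraps around to the last element.

```python
import numpy as np

vals = np.array([10.0, 11.0, 100.0])                  # must be sorted ascending
vec = np.array([9.0, 10.1, 10.7, 11.4, 102, 1100])    # 9.0 added to show the lower edge

# Insertion positions: for interior positions, vals[ss - 1] < vec <= vals[ss].
ss = np.searchsorted(vals, vec)
print(ss)  # [0 1 1 2 3 3]

# Left neighbour. For ss == 0 the index -1 wraps to vals[-1]; this is harmless
# because the right neighbour vals[0] is then always at least as close.
a = vals[ss - 1]

# Right neighbour, clamped so that ss == len(vals) stays in bounds.
b = vals[np.minimum(len(vals) - 1, ss)]

# Keep whichever neighbour is closer (ties go to b).
nearest = np.where(np.fabs(vec - a) < np.fabs(vec - b), a, b)
print(nearest)
```

All temporaries here have the same length as vec, so memory grows with len(vec) rather than with len(vec) * len(vals).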
# Python 3.6.0, NumPy 1.11.3
n = 10**6
vec = np.array([10.1,10.7,11.4,102,1100]*n)
vals = np.array([10.0,11.0,100.0])
# @ThomasPinetz's solution, memory inefficient
def tho(vec, vals):
    return vals[np.argmin(np.abs(vec[:, np.newaxis] - vals), axis=1)]
def jpp(vec, vals):
    ss = np.searchsorted(vals, vec)
    a = vals[ss - 1]
    b = vals[np.minimum(len(vals) - 1, ss)]
    return np.where(np.fabs(vec - a) < np.fabs(vec - b), a, b)
# @Divakar's solution, adapted from first related Q&A link
def diva(A, B):
    L = B.size
    sorted_idx = np.searchsorted(B, A)
    sorted_idx[sorted_idx == L] = L - 1
    mask = (sorted_idx > 0) & \
        (np.abs(A - B[sorted_idx - 1]) < np.abs(A - B[sorted_idx]))
    return B[sorted_idx - mask]
assert np.array_equal(tho(vec, vals), jpp(vec, vals))
assert np.array_equal(tho(vec, vals), diva(vec, vals))
%timeit tho(vec, vals) # 366 ms per loop
%timeit jpp(vec, vals) # 295 ms per loop
%timeit diva(vec, vals) # 334 ms per loop
Answer 2 (score: 1)
If vals is sorted, an element x_k of vec must be rounded to the element y_i of vals if:
(y_(i-1) + y_i)/2 <= x_k < (y_i + y_(i+1))/2
So here is another solution using np.searchsorted, one that keeps the number of operations to a minimum and is at least twice as fast:
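def bm(vec, vals):
    half = vals.copy() / 2
    half[:-1] += half[1:]    # half[i] is now the midpoint of vals[i] and vals[i+1]
    half[-1] = np.inf        # anything above the last midpoint maps to vals[-1]
    ss = np.searchsorted(half, vec)
    return vals[ss]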
%timeit bm(vec, vals) # 84 ms per loop
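To make the midpoint rule concrete, the thresholds that bm builds can be inspected directly. This is only a sketch that unrolls the same statements on the example input from the question.

```python
import numpy as np

vals = np.array([10.0, 11.0, 100.0])            # sorted ascending
vec = np.array([10.1, 10.7, 11.4, 102, 1100])

half = vals / 2           # 5.0, 5.5, 50.0
half[:-1] += half[1:]     # midpoints between neighbours: 10.5 and 55.5
half[-1] = np.inf         # sentinel: everything above 55.5 maps to vals[-1]
print(half)

# Index of the first threshold that is >= each value of vec.
ss = np.searchsorted(half, vec)
print(ss)  # [0 1 1 2 2]

replaced_vals = vals[ss]
print(replaced_vals)
```

With the default side='left', a value lying exactly on a midpoint is assigned to the lower neighbour; as far as I can tell, passing side='right' would send it to the upper one instead, which is what the inequality above describes.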
If vec is also sorted, numba can give a further gain:
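import numpy as np
from numba import njit

@njit
def bmm(vec, vals):
    # Both vec and vals must be sorted in ascending order.
    half = vals.copy() / 2
    half[:-1] += half[1:]
    half[-1] = np.inf
    res = np.empty_like(vec)
    i = 0
    for k in range(vec.size):
        # i never moves backwards, so each threshold is visited at most once
        while half[i] < vec[k]:
            i += 1
        res[k] = vals[i]
    return res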
%timeit bmm(vec, vals) # 31 ms per loop
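The forward walk in bmm never moves i backwards, so it only gives correct results when vec is in ascending order. One possible workaround for unsorted input, which is not part of the answer above, is to sort vec, walk once, and scatter the results back. The sketch below is plain Python without numba so that it runs anywhere; walk_sorted and nearest_any_order are hypothetical names, and decorating walk_sorted with @njit is assumed to work the same way as for bmm but is not tested here.

```python
import numpy as np

def walk_sorted(vec_sorted, vals):
    """Pure-Python version of the forward walk; needs both inputs sorted ascending."""
    half = vals / 2
    half[:-1] += half[1:]
    half[-1] = np.inf
    res = np.empty_like(vec_sorted)
    i = 0
    for k in range(vec_sorted.size):
        # i only ever moves forward, so vec_sorted must be ascending
        while half[i] < vec_sorted[k]:
            i += 1
        res[k] = vals[i]
    return res

def nearest_any_order(vec, vals):
    """Sort vec, walk once, then put the results back in the original order."""
    order = np.argsort(vec, kind="stable")
    res = np.empty_like(vec)
    res[order] = walk_sorted(vec[order], vals)
    return res

vals = np.array([10.0, 11.0, 100.0])
vec = np.array([102, 10.1, 1100, 11.4, 10.7])   # deliberately unsorted
print(nearest_any_order(vec, vals))
```

The extra argsort costs O(n log n), so for unsorted input this may well end up slower than bm; it is mainly meant to illustrate the sortedness requirement.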