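Is it possible to remove the for loops from this function and speed it up? I have not been able to get the same result with a vectorized approach. Or is there another option?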
import numpy as np
indices = np.array(
[814, 935, 1057, 3069, 3305, 3800, 4093, 4162, 4449])
within = np.array(
[193, 207, 243, 251, 273, 286, 405, 427, 696,
770, 883, 896, 1004, 2014, 2032, 2033, 2046, 2066,
2079, 2154, 2155, 2156, 2157, 2158, 2159, 2163, 2165,
2166, 2167, 2183, 2184, 2208, 2210, 2212, 2213, 2221,
2222, 2223, 2225, 2226, 2227, 2281, 2282, 2338, 2401,
2611, 2612, 2639, 2640, 2649, 2700, 2775, 2776, 2785,
3030, 3171, 3191, 3406, 3427, 3527, 3984, 3996, 3997,
4024, 4323, 4331, 4332])
def get_first_ind_after(indices, within):
    """returns array of the first index after each listed in indices
    indices and within must be sorted ascending
    """
    first_after_leading = []
    for index in indices:
        for w_ind in within:
            if w_ind > index:
                first_after_leading.append(w_ind)
                break
    # convert to np array
    first_after_leading = np.array(first_after_leading).flatten()
    return np.unique(first_after_leading)
Given an array of indices, it should return, for each index, the next larger number found in within.
# Output:
[ 883 1004 2014 3171 3406 3984 4323]
Answer 0 (score: 1)
Try this:
[within[within>x][0] if len(within[within>x])>0 else 0 for x in indices]
For example:
In [35]: import numpy as np
...: indices = np.array([814, 935, 1057, 3069, 3305, 3800, 4093, 4162, 4449])
...:
...: within = np.array(
...: [193, 207, 243, 251, 273, 286, 405, 427, 696,
...: 770, 883, 896, 1004, 2014, 2032, 2033, 2046, 2066,
...: 2079, 2154, 2155, 2156, 2157, 2158, 2159, 2163, 2165,
...: 2166, 2167, 2183, 2184, 2208, 2210, 2212, 2213, 2221,
...: 2222, 2223, 2225, 2226, 2227, 2281, 2282, 2338, 2401,
...: 2611, 2612, 2639, 2640, 2649, 2700, 2775, 2776, 2785,
...: 3030, 3171, 3191, 3406, 3427, 3527, 3984, 3996, 3997,
...: 4024, 4323, 4331, 4332])
In [36]: [within[within>x][0] if len(within[within>x])>0 else 0 for x in indices]
Out[36]: [883, 1004, 2014, 3171, 3406, 3984, 4323, 4323, 0]
This uses a Pythonic construct called a list comprehension, which is a compact form of a for-each loop. Expanded into an explicit loop, it would look like this:
result = []
for x in indices:
    # Boolean indexing: returns all of the items in the array that have a value greater than x
    y = within[within > x]
    # y now holds every item larger than x. You wanted the first of these, so take the first
    # item of this new array. y can be empty (no value matches the condition), hence the check.
    if len(y) > 0:
        z = y[0]
    else:
        z = 0  # or None or whatever you like
    # Now add this value to the list that we are building
    result.append(z)
# Now result has the list
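I wrote it this way because it uses a vectorized operation (the boolean mask) and also a list comprehension, which is a more concise way to write a for-each loop that builds a list.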
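Note that the list in Out[36] keeps one entry per index, so it contains 4323 twice and the 0 placeholder, whereas the output expected in the question is deduplicated and has no placeholder. A minimal sketch of post-processing the comprehension into that form follows. The function name `first_after_unique` and the `missing=-1` sentinel are choices made for this illustration, not part of the answer, and the sentinel is assumed never to occur in within.

```python
import numpy as np

def first_after_unique(indices, within, missing=-1):
    """Sketch: list-comprehension lookup, then drop placeholders and repeats.

    `missing` is an illustrative sentinel; it must not be a value in `within`.
    """
    firsts = np.array([
        within[within > x][0] if (within > x).any() else missing
        for x in indices
    ])
    # Remove the sentinel entries, then let np.unique sort and deduplicate
    return np.unique(firsts[firsts != missing])

indices = np.array([814, 935, 1057, 3069, 3305, 3800, 4093, 4162, 4449])
within = np.array(
    [193, 207, 243, 251, 273, 286, 405, 427, 696,
     770, 883, 896, 1004, 2014, 2032, 2033, 2046, 2066,
     2079, 2154, 2155, 2156, 2157, 2158, 2159, 2163, 2165,
     2166, 2167, 2183, 2184, 2208, 2210, 2212, 2213, 2221,
     2222, 2223, 2225, 2226, 2227, 2281, 2282, 2338, 2401,
     2611, 2612, 2639, 2640, 2649, 2700, 2775, 2776, 2785,
     3030, 3171, 3191, 3406, 3427, 3527, 3984, 3996, 3997,
     4024, 4323, 4331, 4332])

print(first_after_unique(indices, within))
```

This still builds one boolean mask per index, so the work grows with len(indices) * len(within); it removes the inner Python loop but is not fully vectorized.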
Answer 1 (score: 1)
Here is an approach based on np.searchsorted:
def next_greater(indices, within):
    idx = np.searchsorted(within, indices)
    idxv = idx[idx<len(within)]
    idxv_unq = np.unique(idxv)
    return within[idxv_unq]
Alternatively, idxv_unq can be computed like this, which should be more efficient:
idxv_unq = idxv[np.r_[True,idxv[:-1] != idxv[1:]]]
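One caveat may be worth checking: with its default side='left', np.searchsorted returns the position of the first element that is >= each index, while the loop in the question uses a strict >. The two agree on the sample data because no value in indices also occurs in within. The sketch below passes side='right' to keep the strict comparison and checks it against a loop equivalent. The names `next_greater_strict` and `loop_reference` and the toy arrays are made up for this illustration.

```python
import numpy as np

def next_greater_strict(indices, within):
    """Sketch: unique first values in `within` strictly greater than each index.

    Assumes `within` is sorted ascending, as in the question.
    """
    # side='right' gives the position of the first element > index
    idx = np.searchsorted(within, indices, side='right')
    # A position equal to len(within) means no larger value exists
    idx = idx[idx < len(within)]
    return within[np.unique(idx)]

def loop_reference(indices, within):
    """Nested-loop equivalent of the question's function, used for checking."""
    found = set()
    for i in indices:
        for w in within:
            if w > i:
                found.add(int(w))
                break
    return np.array(sorted(found), dtype=within.dtype)

# Toy data chosen so that some indices also appear in `within`
within = np.array([10, 20, 20, 30, 40])
indices = np.array([5, 20, 25, 40])

print(next_greater_strict(indices, within))  # [10 30]
print(loop_reference(indices, within))       # [10 30]
```

On this toy input the default side='left' would also return 20 and 40, because it treats an equal value as a match.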