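I have an unsorted array of numbers.
I need to replace certain numbers (given in a list) with specific alternatives (also given in a corresponding list).
I wrote the following code (which seems to work):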
import numpy as np
numbers = np.arange(0,40)
np.random.shuffle(numbers)
problem_numbers = [33, 23, 15] # table, night_stand, plant
alternative_numbers = [12, 14, 26] # desk, dresser, flower_pot
for i in range(len(problem_numbers)):
    idx = numbers == problem_numbers[i]
    numbers[idx] = alternative_numbers[i]
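However, this seems very inefficient (it needs to be done millions of times, on much larger arrays).
I found this question, which answers a similar problem, but in my case the numbers are not sorted and they need to keep their original positions.
Note: numbers may contain the values in problem_numbers.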
Answer 0 (score: 2)
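When the problem_values may or may not all be in numbers, and may even occur multiple times, I would simply use a dict to hold the values to be replaced and dict.get to translate the problematic numbers: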
replacer = dict(zip(problem_numbers, alternative_numbers))
numbers_list = numbers.tolist()
numbers = np.array(list(map(replacer.get, numbers_list, numbers_list)))
Although this has to go "through Python", it is almost self-explanatory; it is (probably) slower than a pure NumPy solution.
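To make the two-argument map call above less opaque, here is a minimal sketch on a small, hypothetical list (the values in numbers_list are made up for illustration). It only shows that passing the same list twice makes each element its own fallback in dict.get:

```python
# Hypothetical small inputs, chosen only for illustration.
problem_numbers = [33, 23, 15]
alternative_numbers = [12, 14, 26]
numbers_list = [5, 33, 7, 15, 33, 40]

replacer = dict(zip(problem_numbers, alternative_numbers))

# map() with two iterables calls replacer.get(key, default) pairwise,
# so each element is its own fallback when it is not a key in the dict.
replaced = list(map(replacer.get, numbers_list, numbers_list))

# Equivalent, more explicit spelling of the same lookup.
replaced_explicit = [replacer.get(n, n) for n in numbers_list]

print(replaced)
```

Values that are not keys of replacer pass through unchanged, and repeated values are all replaced.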
If every problem_value is actually present in the numbers array and you have the numpy_indexed package, you can use numpy_indexed.indices:
>>> import numpy_indexed as ni
>>> numbers[ni.indices(numbers, problem_numbers)] = alternative_numbers
This should be very efficient even for large arrays.
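If numpy_indexed is not available, the same "find the position of each value, then assign" idea can be sketched in plain NumPy with argsort and searchsorted. This is not the library's implementation, only an illustration; it assumes every problem value occurs in numbers exactly once, and the data below is hypothetical:

```python
import numpy as np

numbers = np.array([7, 33, 2, 15, 40, 23])  # hypothetical data, all values unique
problem_numbers = np.array([33, 23, 15])
alternative_numbers = np.array([12, 14, 26])

order = np.argsort(numbers)  # positions that would sort `numbers`
# Position of each problem value within the sorted view of `numbers`.
pos_in_sorted = np.searchsorted(numbers, problem_numbers, sorter=order)
positions = order[pos_in_sorted]  # map back to positions in the original array

# Sanity check for the assumption that every problem value is present.
assert np.array_equal(numbers[positions], problem_numbers)

# The array stays in its original order; only the matched slots change.
numbers[positions] = alternative_numbers
print(numbers)
```

If a value were missing or duplicated, this sketch would either fail the sanity check or replace only one occurrence, so the dict-based approach is the safer choice in that case.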
Answer 1 (score: 2)
Here is a simple way:
import numpy as np
numbers = np.arange(0,40)
np.random.shuffle(numbers)
problem_numbers = [33, 23, 15] # table, night_stand, plant
alternative_numbers = [12, 14, 26] # desk, dresser, flower_pot
# Replace values
problem_numbers = np.asarray(problem_numbers)
alternative_numbers = np.asarray(alternative_numbers)
n_min, n_max = numbers.min(), numbers.max()
replacer = np.arange(n_min, n_max + 1)
mask = (problem_numbers >= n_min) & (problem_numbers <= n_max)  # Discard replacements out of range
replacer[problem_numbers[mask] - n_min] = alternative_numbers[mask]
numbers = replacer[numbers - n_min]
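As a worked example of the lookup table built above, here is a sketch on a tiny, hypothetical input. The data is made up for illustration, and the mask checks both ends of the value range so that a replacement for a value below the minimum cannot wrap around to a negative index:

```python
import numpy as np

numbers = np.array([5, 3, 7, 4, 3])    # hypothetical data
problem_numbers = np.array([3, 7, 9])  # 9 lies outside the range of `numbers`
alternative_numbers = np.array([6, 4, 1])

n_min, n_max = numbers.min(), numbers.max()  # 3 and 7
# Identity table: entry k stands for the value n_min + k and maps to itself.
replacer = np.arange(n_min, n_max + 1)  # [3, 4, 5, 6, 7]

# Keep only replacements whose source value falls inside [n_min, n_max].
mask = (problem_numbers >= n_min) & (problem_numbers <= n_max)
replacer[problem_numbers[mask] - n_min] = alternative_numbers[mask]  # [6, 4, 5, 6, 4]

# One fancy-indexing pass translates every element at once.
result = replacer[numbers - n_min]
print(result)
```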
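This works well as long as the range of values in numbers (the difference between the smallest and the largest) is not huge (e.g. you do not have something like 1, 7 and 10000000000).
Benchmark
I have compared the code in the OP with the three (as of now) proposed solutions using this code: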
import numpy as np
def method_itzik(numbers, problem_numbers, alternative_numbers):
    numbers = np.asarray(numbers)
    for i in range(len(problem_numbers)):
        idx = numbers == problem_numbers[i]
        numbers[idx] = alternative_numbers[i]
    return numbers
def method_mseifert(numbers, problem_numbers, alternative_numbers):
    numbers = np.asarray(numbers)
    replacer = dict(zip(problem_numbers, alternative_numbers))
    numbers_list = numbers.tolist()
    numbers = np.array(list(map(replacer.get, numbers_list, numbers_list)))
    return numbers
def method_divakar(numbers, problem_numbers, alternative_numbers):
    numbers = np.asarray(numbers)
    problem_numbers = np.asarray(problem_numbers)
    alternative_numbers = np.asarray(alternative_numbers)
    # Pre-process problem_numbers and correspondingly alternative_numbers
    # such that repeats and no matches are taken care of
    sidx_pn = problem_numbers.argsort()
    pn = problem_numbers[sidx_pn]
    mask = np.concatenate(([True], pn[1:] != pn[:-1]))
    an = alternative_numbers[sidx_pn]
    minN, maxN = numbers.min(), numbers.max()
    mask &= (pn >= minN) & (pn <= maxN)
    pn = pn[mask]
    an = an[mask]
    # Pre-processing done. Now, we need to use pn and an in place of
    # problem_numbers and alternative_numbers respectively. Map, index and assign.
    sidx = numbers.argsort()
    idx = sidx[np.searchsorted(numbers, pn, sorter=sidx)]
    valid_mask = numbers[idx] == pn
    numbers[idx[valid_mask]] = an[valid_mask]
    return numbers
def method_jdehesa(numbers, problem_numbers, alternative_numbers):
    numbers = np.asarray(numbers)
    problem_numbers = np.asarray(problem_numbers)
    alternative_numbers = np.asarray(alternative_numbers)
    n_min, n_max = numbers.min(), numbers.max()
    replacer = np.arange(n_min, n_max + 1)
    # Discard replacements out of range (on both ends)
    mask = (problem_numbers >= n_min) & (problem_numbers <= n_max)
    replacer[problem_numbers[mask] - n_min] = alternative_numbers[mask]
    numbers = replacer[numbers - n_min]
    return numbers
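The functions above can be timed in several ways; one possible harness, sketched here with the standard-library timeit module, is shown below. The function names replace_loop and replace_dict, the array size, and the repeat counts are all assumptions made for this sketch, not the setup used for the original comparison. It first checks that two methods agree, then reports the best time per call:

```python
import timeit

import numpy as np


def replace_loop(numbers, problem_numbers, alternative_numbers):
    # Same idea as the loop in the question, applied to a copy of the input.
    numbers = np.array(numbers)
    for problem, alternative in zip(problem_numbers, alternative_numbers):
        numbers[numbers == problem] = alternative
    return numbers


def replace_dict(numbers, problem_numbers, alternative_numbers):
    # Same idea as the dict-based answer.
    replacer = dict(zip(problem_numbers, alternative_numbers))
    values = np.asarray(numbers).tolist()
    return np.array([replacer.get(v, v) for v in values])


rng = np.random.default_rng(0)  # fixed seed so runs are repeatable
numbers = rng.permutation(1000)  # hypothetical test size
problem_numbers = [33, 23, 15]
alternative_numbers = [12, 14, 26]

# Check agreement before timing; a fast but wrong method is of no use.
expected = replace_loop(numbers, problem_numbers, alternative_numbers)
assert np.array_equal(expected, replace_dict(numbers, problem_numbers, alternative_numbers))

timings = {}
for name, func in [("loop", replace_loop), ("dict", replace_dict)]:
    timer = timeit.Timer(lambda: func(numbers, problem_numbers, alternative_numbers))
    # Take the best of several repeats to reduce noise from other processes.
    timings[name] = min(timer.repeat(repeat=3, number=20)) / 20

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} us per call")
```

Both helpers work on a copy, so repeated timing runs always see the same input; absolute numbers depend on the machine and on the array size.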