Question

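I work with large datasets in my research.

I need to duplicate an element in a NumPy array. The code below does this, but is there a NumPy function that performs the operation more efficiently?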

"""
Example output
>>> (executing file "example.py")
Choose the number you want to repeat (1-10):
2
Choose number of repetitions:
9
Your output array is:
 [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10]

>>>
"""
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

y = int(input('Choose the number you want to repeat (1-10):\n'))
repetitions = int(input('Choose number of repetitions:\n'))
output = []

for i in range(len(x)):
    if x[i] != y:
        output.append(x[i])
    else:
        for j in range(repetitions):
            output.append(x[i])

print('Your output array is:\n', output)

Answer 1

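One approach is to find the index of the element to be repeated with np.searchsorted, which assumes x is sorted. Use that index to slice the array into a left part and a right part, and insert the repeated values between them.

Thus, one solution would be -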

import numpy as np
idx = np.searchsorted(x, y)  # assumes x is sorted and contains y
out = np.concatenate((x[:idx], np.repeat(y, repetitions), x[idx+1:]))

Let's consider a more generic sample case, with x as -

x = [2, 4, 5, 6, 7, 8, 9, 10]

Let the number to be repeated be y = 5, with repetitions = 7.

Now, using the proposed code -

In [57]: idx = np.searchsorted(x,y)

In [58]: idx
Out[58]: 2

In [59]: np.concatenate(( x[:idx], np.repeat(y, repetitions), x[idx+1:] ))
Out[59]: array([ 2,  4,  5,  5,  5,  5,  5,  5,  5,  6,  7,  8,  9, 10])
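
Note that np.searchsorted returns an insertion point whether or not y is actually in x, so the slicing above silently drops a different element when y is missing. A minimal sketch that wraps the same idea in a function with a membership check added (the name repeat_sorted_element is hypothetical, not part of NumPy, and x is still assumed to be sorted) -

```python
import numpy as np

def repeat_sorted_element(x, y, repetitions):
    """Repeat the first occurrence of y in the sorted sequence x."""
    x = np.asarray(x)
    idx = np.searchsorted(x, y)  # leftmost insertion point for y
    # searchsorted does not check membership, so verify y is really there.
    if idx == len(x) or x[idx] != y:
        raise ValueError(f"{y} not found in x")
    return np.concatenate((x[:idx], np.repeat(y, repetitions), x[idx + 1:]))

result = repeat_sorted_element([2, 4, 5, 6, 7, 8, 9, 10], 5, 7)
print(result)
```

Raising an error is only one possible choice here; returning x unchanged when y is absent would match the behaviour of the loop in the question more closely.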

For the specific case where x is always [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], the value y sits at index y - 1, so there is a more compact/elegant solution, like so -

np.r_[x[:y-1], [y]*repetitions, x[y:]]
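
To check the compact form end to end, here is a short sketch. It assumes x holds exactly the integers 1 to 10 in order, and it takes y = 2 and repetitions = 9 from the example in the question -

```python
import numpy as np

x = np.arange(1, 11)  # [1, 2, ..., 10], so the value y is at index y - 1
y = 2
repetitions = 9

# Left slice, then the repeated value, then everything after y.
out = np.r_[x[:y - 1], [y] * repetitions, x[y:]]
print(out)
```

The shortcut relies entirely on the value-to-index relationship, so it does not carry over to arbitrary arrays.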

Answer 2

There is the numpy.repeat function:

>>> np.repeat(3, 4)
array([3, 3, 3, 3])

>>> x = np.array([[1,2],[3,4]])

>>> np.repeat(x, 2)
array([1, 1, 2, 2, 3, 3, 4, 4])

>>> np.repeat(x, 3, axis=1)
array([[1, 1, 1, 2, 2, 2],
       [3, 3, 3, 4, 4, 4]])

>>> np.repeat(x, [1, 2], axis=0)
array([[1, 2],
       [3, 4],
       [3, 4]])
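
The last form, where the second argument gives one repeat count per element, can be applied to the question directly. This is a sketch, assuming x is a 1-D array. Unlike the np.searchsorted approach it does not need x to be sorted, and it repeats every occurrence of y rather than only the first (the variable name counts is just illustrative):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = 2
repetitions = 9

# One repeat count per element: `repetitions` where x equals y, otherwise 1.
counts = np.where(x == y, repetitions, 1)
output = np.repeat(x, counts)
print(output)
```

If y does not occur in x, every count is 1 and the array comes back unchanged, which matches what the loop in the question does.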

Duplicating a specific element in a list or NumPy array
