NumPy array: fast element-wise compare and set?

Asked: 2018-01-18 22:42:33

Tags: python arrays numpy

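Is there a function that lets me quickly compare the values in a NumPy array against a fixed value and set them accordingly?

For example, suppose I have an array with the following values: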

0 0 0 3 7 3 0 0 0

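I want to say: from index position [3] up to (but not including) index position [7], if a value is below 5, set it to 5. The result would be: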

0 0 0 5 7 5 5 0 0

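The reason I ask is that doing this operation "by hand" in a loop seems to be extremely slow. For example, the following code takes about 90 seconds to perform this operation one million times, each time on 64 consecutive elements of a one-million-element array: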

import numpy as np
import random

tsize = 1000000
arr = np.zeros(tsize, dtype=np.uint32)

for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    for kpos in range(apos, apos + 64):    # loop to compare and set 64 elements
        if arr[kpos] < num:
            arr[kpos] = num

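If there is no such function: are there any obvious NumPy beginner mistakes in the code above that make it slow?

2 answers:

Answer 0 (score: 2)

The inner for loop can be replaced with slicing and assignment, like this: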

arr[apos:apos+64] = np.clip(arr[apos:apos+64], a_min=num, a_max=None)

You can also use np.maximum:

arr[apos:apos+64] = np.maximum(arr[apos:apos+64], num)
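Both lines above build a temporary array and then copy it back into the slice. As a minimal sketch (the name `window` and the sample values are illustrative, not taken from the question), `np.maximum` can also write straight into the slice through its `out` argument, because a basic slice is a view onto the original array:

```python
import numpy as np

arr = np.zeros(20, dtype=np.uint32)
apos, num = 5, 42            # illustrative position and value

window = arr[apos:apos + 8]  # basic slicing returns a view, not a copy
np.maximum(window, num, out=window)  # element-wise maximum, written in place

print(arr)
```

Whether this is measurably faster for windows as small as 64 elements is not something the timings in this answer cover, so it would need to be measured separately.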

Timings

import numpy as np
import random

tsize = 1000
arr = np.zeros(tsize, dtype=np.uint32)

%%timeit
for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    for kpos in range(apos, apos + 64):    # loop to compare and set 64 elements
        if arr[kpos] < num:
            arr[kpos] = num
# 10 loops, best of 3: 107 ms per loop

%%timeit
for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    arr[apos:apos+64] = np.clip(arr[apos:apos+64], a_min=num, a_max=None)
# 100 loops, best of 3: 4.14 ms per loop

%%timeit
for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    arr[apos:apos+64] = np.maximum(arr[apos:apos+64], num)
# 100 loops, best of 3: 4.13 ms per loop

# @Alexander's solution (ndarray.clip method)
%%timeit
for rounds in range(tsize):
    num = random.randint(1, 123456)        # generate a random number
    apos = random.randint(0, tsize - 64)   # a random position
    arr[apos:apos+64] = arr[apos:apos+64].clip(min=num)
# 100 loops, best of 3: 3.69 ms per loop
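
The timings compare speed only. As a rough sanity check that the sliced version computes the same result as the original element-by-element loop, the sketch below runs both with the same seeded random sequence and compares the arrays. The helper names `run_loop` and `run_sliced` and the seed are assumptions made for this illustration.

```python
import random

import numpy as np


def run_loop(tsize, rounds, seed):
    """Original approach: compare and set one element at a time."""
    rng = random.Random(seed)
    arr = np.zeros(tsize, dtype=np.uint32)
    for _ in range(rounds):
        num = rng.randint(1, 123456)
        apos = rng.randint(0, tsize - 64)
        for kpos in range(apos, apos + 64):
            if arr[kpos] < num:
                arr[kpos] = num
    return arr


def run_sliced(tsize, rounds, seed):
    """Vectorized approach: one np.maximum call per 64-element window."""
    rng = random.Random(seed)
    arr = np.zeros(tsize, dtype=np.uint32)
    for _ in range(rounds):
        num = rng.randint(1, 123456)
        apos = rng.randint(0, tsize - 64)
        arr[apos:apos + 64] = np.maximum(arr[apos:apos + 64], num)
    return arr


# Same seed, so both functions see the same (num, apos) sequence.
same = np.array_equal(run_loop(1000, 1000, seed=0),
                      run_sliced(1000, 1000, seed=0))
print(same)
```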

Answer 1 (score: 2)

You can use clip together with array slicing.

import numpy as np

a = np.array([0, 0, 0, 3, 7, 3, 0, 0, 0])
a[3:7] = a[3:7].clip(min=5)  # raise every value at indices 3..6 to at least 5
print(a)  # [0 0 0 5 7 5 5 0 0]
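
For comparison, here is a sketch of the same operation written as an explicit "compare, then set" using a boolean mask, which is closer to how the question phrases it. The name `window` is illustrative. The slice `a[3:7]` is a view, so assigning through it changes `a` itself.

```python
import numpy as np

a = np.array([0, 0, 0, 3, 7, 3, 0, 0, 0])
window = a[3:7]          # view covering indices 3, 4, 5 and 6
window[window < 5] = 5   # compare, then set only the elements below 5
print(a)                 # [0 0 0 5 7 5 5 0 0]
```

The mask form generalizes to conditions that `clip` cannot express, while `clip` and `np.maximum` are the more direct choice for a simple lower bound.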