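Goal: merge the arrays p1 through p10 into one large array named "a", and return every value that appears in "a" exactly 4 times.
Problem: because it has to run through all the loops, this code is very slow. How can I make it faster? Would vectorization and/or broadcasting help (is it possible to get rid of all the loops)? Or are there any other out-of-the-box ideas?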
import numpy as np
import itertools
from numba import jit
p1 = np.random.randint(0,314000,200000)
p2 = np.random.randint(0,314000,100000)
p3 = np.random.randint(0,314000,300000)
p4 = np.random.randint(0,314000,150000)
p5 = np.random.randint(0,314000,220000)
p6 = np.random.randint(0,314000,320000)
p7 = np.random.randint(0,314000,212100)
p8 = np.random.randint(0,314000,100500)
p9 = np.random.randint(0,314000,300700)
p10 = np.random.randint(0,314000,200300)
@jit
def count(a,n):
    counters=np.zeros(10**6,np.int32)
    for i in a:
        counters[i] += 1
    res=np.empty_like(counters)
    k = 0
    for i,j in enumerate(counters):
        if j == n:
            res[k] = i
            k += 1
    return res[:k]

for t in range(0, 20000):
    a = itertools.chain(p1,p2,p3,p4,p5,p6,p7,p8,p9,p10)
    count(a,4)
Answer 0 (score: 3)
Yes and yes. You can get rid of the loops, and doing so speeds things up:
>>> a = np.concatenate([p1,p2,p3,p4,p5,p6,p7,p8,p9,p10])
>>> np.flatnonzero(np.bincount(a, minlength=314000)==4)
array([ 29, 33, 38, ..., 313944, 313949, 313973])
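For reuse, the two calls above could be wrapped in a small function. This is only a sketch: the name values_with_count and the parameter num_values (an exclusive upper bound on the values, 314000 in the question) are made up for illustration, and it assumes all values are non-negative integers, which np.bincount requires. The np.unique line is an independent cross-check, not a faster alternative.

```python
import numpy as np

def values_with_count(arrays, n, num_values):
    """Return the sorted values occurring exactly n times across all arrays.

    Hypothetical helper: num_values is an assumed exclusive upper bound
    on the (non-negative integer) values stored in the arrays.
    """
    a = np.concatenate(arrays)                     # one flat array, no Python loop
    counts = np.bincount(a, minlength=num_values)  # counts[v] == occurrences of v
    return np.flatnonzero(counts == n)             # indices (= values) with count n

# Tiny worked example: 2 and 5 each occur four times in total.
parts = [np.array([1, 2, 2, 5]), np.array([2, 5, 5, 2]), np.array([5, 7])]
result = values_with_count(parts, 4, 10)
print(result)  # [2 5]

# Cross-check with np.unique, which also counts occurrences per distinct value.
vals, cnts = np.unique(np.concatenate(parts), return_counts=True)
assert np.array_equal(result, vals[cnts == 4])
```

With the arrays from the question, the call would be values_with_count([p1, p2, p3, p4, p5, p6, p7, p8, p9, p10], 4, 314000).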