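I use a NumPy array as a coordinate system. All coordinates are ints. My function checks whether the x and y coordinates match an input coordinate and returns the index of the matching element in the NumPy array.
It is a standalone function in my script, and when I run cProfile on the script it comes out as the slowest function.
My question is: does anyone know a faster way to search an Nx2 NumPy array?
Here is the function: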
import numpy as np

def findPerson(coordinate, input_array):
    # Indices of all rows whose x and y both equal the given coordinate.
    return np.where( (input_array[:,0] == coordinate[0]) & (input_array[:,1] == coordinate[1]) )[0]
Edit: as requested, here is a test sample.
predator = np.array([1, 1])
people_array = np.array([[-1, 1], [2, 2], [1, 1]])
print(findPerson(predator, people_array))
This test script prints [2], the index in people_array of the row with the same coordinates as the "predator".
Answer 0 (score: 1)
My benchmark
Timings:
|---------------------|---------|
|Algorithm |Time |
|---------------------|---------|
|Original (findPerson)| 8.4 ms|
|Firman (find_idx) | 41.8 ms|
|Numba | 4.89 ms|
|Numba parallel | 3.21 ms|
|Cython | 2.70 ms|
|Cython parallel | 2.90 ms|
|---------------------|---------|
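To make the table easier to read, the timings can be turned into speedup factors relative to the original findPerson. This is only a small arithmetic sketch; the dictionary simply copies the numbers from the table, and the variable names are my own.

```python
# Timings in milliseconds, copied from the table above.
timings_ms = {
    "Original (findPerson)": 8.4,
    "Firman (find_idx)": 41.8,
    "Numba": 4.89,
    "Numba parallel": 3.21,
    "Cython": 2.70,
    "Cython parallel": 2.90,
}

baseline = timings_ms["Original (findPerson)"]
# A factor above 1 means faster than the original findPerson.
speedups = {name: baseline / ms for name, ms in timings_ms.items()}
fastest = min(timings_ms, key=timings_ms.get)

for name, factor in speedups.items():
    print(f"{name:<22} {factor:5.2f}x")
print("fastest:", fastest)
```

On these numbers the Cython loop is roughly 3x faster than the original, while find_idx is about 5x slower.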
My setup:
Processor: Intel i5-9600K 3.70GHz, 6 core
Versions:
Python: 3.8.0
Numba: 0.46.0
Cython: 0.29.14
Numpy: 1.17.4
My data setup (the original array was too small, so I repeated it 1 million times):
predator = np.array([1, 1])
people_array = np.array([[-1, 1], [2, 2], [1, 1]]*1000000)
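Timings of this kind could be gathered roughly as follows. This is a sketch using the standard-library timeit module on the two pure-NumPy functions from this thread; the helper time_ms and the smaller array size are my own assumptions, chosen so the example runs quickly, and the numbers it prints will differ from machine to machine.

```python
import timeit
import numpy as np

def findPerson(coordinate, input_array):
    return np.where((input_array[:, 0] == coordinate[0]) & (input_array[:, 1] == coordinate[1]))[0]

def find_idx(pt, ptslist):
    return np.where(np.all(pt == ptslist, axis=1))[0]

predator = np.array([1, 1])
# Assumption: 10,000 repeats instead of 1,000,000, to keep the sketch fast.
people_array = np.array([[-1, 1], [2, 2], [1, 1]] * 10_000)

def time_ms(func, repeats=5):
    """Best wall time of a single call over `repeats` runs, in milliseconds."""
    runs = timeit.repeat(lambda: func(predator, people_array), number=1, repeat=repeats)
    return min(runs) * 1e3

timings = {f.__name__: time_ms(f) for f in (findPerson, find_idx)}
for name, ms in timings.items():
    print(f"{name}: {ms:.3f} ms")
```

In Jupyter the same measurement is usually done with the %timeit magic instead.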
So the fastest option is Cython (the code works in Jupyter):
%%cython -a
import cython
import numpy as np
cimport numpy as np
from numpy cimport ndarray
from cython.parallel import prange
from libc.stdint cimport uint32_t, int64_t

# Both arguments must be int64 arrays; where NumPy's default integer is 32-bit,
# convert first, e.g. with arr.astype(np.int64).
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.infer_types(True)
@cython.initializedcheck(False)
def cy_loop(int64_t[:] coordinate,int64_t[:,:] input_array):
    alen = input_array.shape[0]
    # Worst case every row matches, so reserve alen slots.
    cdef uint32_t[:] res1 = np.empty(alen,np.uint32)
    cdef uint32_t ii = 0  # number of matches found so far
    for i in range(alen):
        if (input_array[i,0] == coordinate[0]) and (input_array[i,1] == coordinate[1]):
            res1[ii] = i
            ii = ii + 1
    # Return only the filled prefix.
    return np.asarray(res1[:ii])
The easiest to set up is non-parallel Numba:
import numba
import numpy as np
@numba.njit(nogil=True)
def findPerson_nb2(coordinate, input_array):
    return np.where( (input_array[:,0] == coordinate[0]) & (input_array[:,1] == coordinate[1]) )[0]
Numba parallel (perhaps also easy to set up, but parallel programs can be tricky):
import math
import numba
import numpy as np

@numba.njit(parallel=True)
def findPerson_nb4(coordinate, input_array, alen):
    # alen = input_array.shape[0]
    n_batches = 768 #6*16*8
    batch_size = math.ceil(alen/n_batches)
    # One padded row of matches per batch, plus the number of valid entries in each row.
    res = np.empty((n_batches, batch_size),dtype=np.int64)
    res_len = np.empty(n_batches,dtype=np.int64)
    for i in numba.prange(n_batches):
        start = i*batch_size
        if i == (n_batches - 1):
            end = alen
        else:
            end = (i+1)*batch_size
        # Offset by start so the indices refer to the full array, not the slice.
        res_i = start + np.where( (input_array[start:end,0] == coordinate[0]) & (input_array[start:end,1] == coordinate[1]))[0]
        ailen = res_i.shape[0]
        res[i,:ailen] = res_i
        res_len[i] = ailen
    return res, res_len

@numba.njit()
def myconcat(a_in, a_in_len, alen):
    # Copy the valid prefix of every batch row into one flat array.
    res = np.empty(alen,dtype=np.int64)
    ii = 0
    for i in range(a_in_len.shape[0]):
        for j in range(a_in_len[i]):
            res[ii] = a_in[i,j]
            ii = ii + 1
    # Only the first ii entries of res are meaningful.
    return res, ii
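The batch-then-concatenate scheme above can be checked against a plain NumPy reference. The sketch below is my own serial illustration of the same idea without Numba: find_batched and its default n_batches are hypothetical, and the batch bounds are clamped to the array length so that small arrays also work.

```python
import math
import numpy as np

def find_batched(coordinate, input_array, n_batches=4):
    """Serial reference: search each batch, then concatenate the valid hits."""
    alen = input_array.shape[0]
    batch_size = math.ceil(alen / n_batches)
    res = np.empty((n_batches, batch_size), dtype=np.int64)  # padded hits per batch
    res_len = np.zeros(n_batches, dtype=np.int64)            # valid entries per row
    for i in range(n_batches):
        start = min(i * batch_size, alen)
        end = min(start + batch_size, alen)
        chunk = input_array[start:end]
        # Offset by start so indices refer to the full array.
        hits = start + np.where((chunk[:, 0] == coordinate[0]) & (chunk[:, 1] == coordinate[1]))[0]
        res[i, :hits.shape[0]] = hits
        res_len[i] = hits.shape[0]
    # Second stage: keep only the valid prefix of each row, in batch order.
    return np.concatenate([res[i, :res_len[i]] for i in range(n_batches)])

predator = np.array([1, 1])
people_array = np.array([[-1, 1], [2, 2], [1, 1]] * 5)
print(find_batched(predator, people_array))
```

Because the batches are processed in order, the result comes out sorted just like the output of a single np.where call.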
Cython parallel:
%%cython -a --compile-args=/openmp --link-args=/openmp --force
# /openmp is the MSVC flag; with GCC or Clang the equivalent is -fopenmp.
import cython
import numpy as np
cimport numpy as np
from numpy cimport ndarray
from cython.parallel import prange
from libc.stdint cimport uint32_t, int64_t
from libc.math cimport ceil

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.infer_types(True)
@cython.initializedcheck(False)
def cy_loop3(int64_t[:] coordinate,int64_t[:,:] input_array):
    alen = input_array.shape[0]
    cdef uint32_t n_batches = 6*16 #6*16*8
    cdef uint32_t batch_size = <uint32_t>ceil(alen/n_batches)
    # One padded row of matches per batch, plus the number of valid entries in each row.
    cdef uint32_t[:,:] res = np.empty((n_batches, batch_size),dtype=np.uint32)
    cdef uint32_t[:] res_len = np.empty(n_batches,dtype=np.uint32)
    cdef uint32_t start, end, ii, i, j
    for i in prange(n_batches, nogil=True):
        start = i*batch_size
        if i == (n_batches - 1):
            end = alen
        else:
            end = (i+1)*batch_size
        ii = 0
        for j in range(start,end):
            if (input_array[j,0] == coordinate[0]) and (input_array[j,1] == coordinate[1]):
                res[i, ii] = j
                # Plain assignment rather than ii += 1, which Cython would treat as a reduction inside prange.
                ii = ii + 1
        res_len[i] = ii
    # The caller still has to merge the valid prefix of each row.
    return np.asarray(res), np.asarray(res_len)
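Since cy_loop3 returns a padded 2-D array together with the per-batch counts, the caller still has to flatten the result. One possible way, sketched here in plain NumPy, is a boolean mask built from the counts; flatten_batches and the sample arrays are hypothetical and only stand in for real output.

```python
import numpy as np

def flatten_batches(res, res_len):
    """Collect the first res_len[i] entries of each row of res, in row order."""
    res = np.asarray(res)
    res_len = np.asarray(res_len)
    # valid[i, k] is True when column k holds a real hit for batch i.
    valid = np.arange(res.shape[1]) < res_len[:, None]
    return res[valid]

# Hypothetical padded output: 3 batches with room for 4 hits each.
res = np.array([[2, 5, 0, 0],
                [8, 0, 0, 0],
                [11, 14, 17, 0]], dtype=np.uint32)
res_len = np.array([2, 1, 3], dtype=np.uint32)

flat = flatten_batches(res, res_len)
print(flat)  # the indices 2, 5, 8, 11, 14, 17
```

Boolean indexing walks the mask in row-major order, so the hits keep their batch order without an explicit loop.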
Answer 1 (score: 0)
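Can you try this?
import numpy as np
def find_idx(pt, ptslist):
    # Broadcast pt against every row, then keep the rows where all components match.
    return np.where(np.all(pt == ptslist, axis=1))[0]
It also works for more than 2 dimensions.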
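As a quick illustration of the higher-dimensional case, here is find_idx applied to 3-D points. The sample data is made up for the example.

```python
import numpy as np

def find_idx(pt, ptslist):
    return np.where(np.all(pt == ptslist, axis=1))[0]

# Hypothetical 3-D points; rows 1 and 3 equal the target.
points_3d = np.array([[0, 0, 0],
                      [1, 2, 3],
                      [1, 2, 4],
                      [1, 2, 3]])
target = np.array([1, 2, 3])

matches = find_idx(target, points_3d)
print(matches)  # [1 3]
```

The comparison pt == ptslist relies on broadcasting, so pt only needs as many components as ptslist has columns.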