Question

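I am currently working with a fairly large dataset of 3D points (x, y, z) and would like an efficient way to identify which points lie within a set of circles in the xy-plane, each with radius r and centre (x1, y1), where x1 and y1 are grid coordinates (each of length 120). The circles will overlap, and some points will belong to more than one circle.

The output would therefore be an identifier for each of the 14400 circles (120 * 120), together with which points from the (x, y, z) list lie inside each circle.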

import numpy as np

def inside_circle(x, y, x0, y0, r):
    return (x - x0)*(x - x0) + (y - y0)*(y - y0) < r*r

x = np.random.random_sample((10000,))
y = np.random.random_sample((10000,))

x0 = np.linspace(min(x),max(x),120)
y0 = np.linspace(min(y),max(y),120)

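# One row per circle (120 * 120 = 14400), one column per point.
idx = np.zeros((14400,10000))
r = 2
count = 0

# Test every point against each circle centre (x0[i], y0[j]) in turn.
for i in range(0,120):
    for j in range(0,120):
        idx[count,:] = inside_circle(x,y,x0[i],y0[j],r)
        count = count + 1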

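Here inside_circle is a function that returns a boolean array with True or False for each tested point, indicating whether that point lies inside the circle of radius r centred at (x0[i], y0[j]).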
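As a small illustration of that behaviour (the three test points and the unit circle below are made up for the example, not taken from the dataset), the function can be called on whole arrays at once:

```python
import numpy as np

def inside_circle(x, y, x0, y0, r):
    # Compare squared distance with squared radius to avoid a square root.
    return (x - x0)**2 + (y - y0)**2 < r*r

# Hypothetical test points: two inside the unit circle at the origin, one outside.
px = np.array([0.0, 0.5, 2.0])
py = np.array([0.0, 0.5, 0.0])

mask = inside_circle(px, py, 0.0, 0.0, 1.0)
print(mask)  # [ True  True False]
```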

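My main question is whether there is a more efficient approach than the nested for loop. More generally, is there a more efficient way to do any of this? I am new to Python.

Thanks for any replies!

Alec.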

Answer 1

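Here is an approach that uses array broadcasting. It is somewhat faster than the nested for loop (0.5 s versus 0.8 s on my machine), although I think readability suffers.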

import numpy as np

x = np.random.random_sample((1, 10000))
y = np.random.random_sample((1, 10000))

x0 = np.reshape(np.linspace(np.min(x),np.max(x),120), (120, 1))
y0 = np.reshape(np.linspace(np.min(y),np.max(y),120), (120, 1))

r = 2

all_xdiffssquared = np.subtract(x, x0)**2
all_ydiffssquared = np.subtract(y, y0)**2
# "3d" here means the arrays have three dimensions, not the 3D geometry of the points.
all_xdiffssquared_3d = np.reshape(all_xdiffssquared, (120, 10000, 1))
all_ydiffssquared_3d = np.reshape(np.transpose(all_ydiffssquared), (1, 10000, 120))
# Negative entries mark points inside a circle; the shape is (120, 10000, 120).
all_distances_3d = all_xdiffssquared_3d + all_ydiffssquared_3d - r**2
# Move the point axis last, flatten the two grid axes, and read the sign bit as the mask.
idx = np.signbit(np.reshape(np.moveaxis(all_distances_3d, 1, -1), (14400, 10000)))
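
For comparison, the same broadcasting idea can probably be written a little more readably by indexing with None instead of reshaping and transposing. This is only a sketch, not the code timed above: the sizes are reduced so that it runs quickly, and the seeded generator, the names n_points and n_grid, and r = 0.1 are assumptions made for the example. It also checks the result against the nested loop from the question.

```python
import numpy as np

# Assumed, smaller sizes than the 10000 points / 120 grid values in the post.
n_points, n_grid = 1000, 12
rng = np.random.default_rng(0)
x = rng.random(n_points)
y = rng.random(n_points)

x0 = np.linspace(x.min(), x.max(), n_grid)
y0 = np.linspace(y.min(), y.max(), n_grid)
r = 0.1  # assumed radius, small enough that not every point is inside

# dx2[i, k] = (x[k] - x0[i])**2 and dy2[j, k] = (y[k] - y0[j])**2
dx2 = (x[None, :] - x0[:, None]) ** 2
dy2 = (y[None, :] - y0[:, None]) ** 2

# inside[i, j, k] is True when point k lies in the circle centred at (x0[i], y0[j]).
inside = dx2[:, None, :] + dy2[None, :, :] < r * r

# Flatten the two grid axes so row i * n_grid + j matches the loop's counter.
idx = inside.reshape(n_grid * n_grid, n_points)

# Reference result from the nested loop, for checking only.
expected = np.empty((n_grid * n_grid, n_points), dtype=bool)
count = 0
for i in range(n_grid):
    for j in range(n_grid):
        expected[count] = (x - x0[i]) ** 2 + (y - y0[j]) ** 2 < r * r
        count += 1

assert np.array_equal(idx, expected)
```

Note that the intermediate array still holds one value per (circle, point) pair, so at the full size in the question it needs on the order of a gigabyte as float64 before the comparison; processing the grid in chunks would be one way to limit that.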
