How to vectorize the calculation of barycentric coordinates in Python

Time: 2019-09-10 03:20:07

Tags: python numpy scipy vectorization delaunay

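scipy.spatial provides the Delaunay class. Its documentation includes an example of how to calculate barycentric coordinates.

Following that example, the code below uses a loop to compute the barycentric coordinates.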

import numpy as np
from scipy.spatial import Delaunay

points = np.array([(0,0),(0,1),(1,0),(1,1)])
samples = np.array([(0.5,0.5),(0,0),(0.1,0.1)])

dim    = len(points[0])               # determine the dimension of the samples
simp   = Delaunay(points)             # create simplexes for the defined points
s      = simp.find_simplex(samples)   # for each sample, find the index of the simplex that contains it
b0     = np.zeros((len(samples),dim)) # reserve space for the first dim barycentric coordinates
for ii in range(len(samples)):
    # transform[s, :dim] is the inverse affine matrix, transform[s, dim] the reference vertex
    b0[ii,:] = simp.transform[s[ii],:dim].dot((samples[ii] - simp.transform[s[ii],dim]).transpose())
coord = np.c_[b0, 1 - b0.sum(axis=1)] # the last coordinate follows because the weights sum to 1

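This works fine for converting a short list of samples to barycentric coordinates, but performance is poor for very large sample lists. How can it be modified to take advantage of the vectorized math in numpy / scipy to improve performance?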
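As a sanity check on what the loop computes, the sketch below (not part of the original question) rebuilds each sample from its barycentric weights and the vertices of its simplex. The helper name `barycentric_loop` is only illustrative, and the check assumes every sample lies inside the convex hull of `points`.

```python
import numpy as np
from scipy.spatial import Delaunay

def barycentric_loop(points, samples):
    """Loop-based barycentric coordinates, as in the question (hypothetical helper name)."""
    dim = points.shape[1]
    tri = Delaunay(points)
    s = tri.find_simplex(samples)
    b0 = np.zeros((len(samples), dim))
    for ii in range(len(samples)):
        b0[ii] = tri.transform[s[ii], :dim].dot(samples[ii] - tri.transform[s[ii], dim])
    coord = np.c_[b0, 1 - b0.sum(axis=1)]
    return tri, s, coord

points = np.array([(0, 0), (0, 1), (1, 0), (1, 1)], dtype=float)
samples = np.array([(0.5, 0.5), (0, 0), (0.1, 0.1)])

tri, s, coord = barycentric_loop(points, samples)

# Vertices of the simplex containing each sample: shape (n_samples, dim + 1, dim)
verts = points[tri.simplices[s]]

# The weighted sum of the vertices should give back the original samples
rebuilt = np.einsum('ij,ijk->ik', coord, verts)
print(np.allclose(rebuilt, samples))
```

If the weights in each row sum to 1 and `rebuilt` matches `samples`, the coordinates are consistent with the triangulation.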

1 Answer:

Answer 0 (score: 1)

Consider the following modification (replacing the for loop with numpy methods):

import numpy as np
import scipy.spatial as ssp

def f_1(points, samples):
    """ original """

    dim = len(points[0])
    simp = ssp.Delaunay(points)
    s = simp.find_simplex(samples)
    b0 = np.zeros((len(samples), dim))

    for ii in range(len(samples)):
        b0[ii, :] = simp.transform[s[ii], :dim].dot(
            (samples[ii] - simp.transform[s[ii], dim]).transpose())
    coord = np.c_[b0, 1 - b0.sum(axis=1)]

    return coord

def f_2(points, samples):
    """ modified """

    simp = ssp.Delaunay(points)
    s = simp.find_simplex(samples)

    b0 = (simp.transform[s, :points.shape[1]].transpose([1, 0, 2]) *
          (samples - simp.transform[s, points.shape[1]])).sum(axis=2).T
    coord = np.c_[b0, 1 - b0.sum(axis=1)]

    return coord
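
The same contraction can also be written with `np.einsum`, which some readers may find easier to follow than the transpose / broadcast / sum chain in `f_2`. This is a sketch rather than part of the original answer: the name `f_3` is hypothetical, and like `f_2` it assumes all samples lie inside the convex hull of `points`.

```python
import itertools
import numpy as np
import scipy.spatial as ssp

def f_3(points, samples):
    """einsum variant (hypothetical name, not from the original answer)"""
    dim = points.shape[1]
    simp = ssp.Delaunay(points)
    s = simp.find_simplex(samples)

    T = simp.transform[s, :dim]   # per-sample inverse matrix, shape (n, dim, dim)
    r = simp.transform[s, dim]    # per-sample reference vertex, shape (n, dim)

    # Batched matrix-vector product: b0[i, j] = sum_k T[i, j, k] * (samples - r)[i, k]
    b0 = np.einsum('ijk,ik->ij', T, samples - r)
    return np.c_[b0, 1 - b0.sum(axis=1)]

# Small demo on a 5 x 5 grid with samples strictly inside the hull
demo_points = np.array(list(itertools.product(range(5), repeat=2)), dtype=float)
demo_samples = np.random.default_rng(0).random((1000, 2)) * 4
demo_coord = f_3(demo_points, demo_samples)
```

Whether this is faster than `f_2` depends on the array sizes and the NumPy build, so it is worth timing both on real data.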

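Test case:

import itertools

N = 100
points = np.array(list(itertools.product(range(N), repeat=2)))
# note: the grid spans 0..N-1, so samples with a coordinate above N - 1 fall
# outside the convex hull and find_simplex returns -1 for them
samples = np.random.rand(100_000, 2) * N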

Results:

%timeit f_1(points, samples)
712 ms ± 2.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit f_2(points, samples)
422 ms ± 809 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

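With the modified version, the line simp.find_simplex(samples) takes up about 95% of the run time. So I guess there is not much more that vectorization can do here. To improve performance further, you would need a different implementation of the find_simplex method or a different approach to the problem.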
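
One way to check where the time goes is to time the two stages separately. The sketch below is an illustration, not the measurement used in the answer: `time_stages` is a hypothetical helper, the triangulation is built outside the timed region, the grid and sample count are smaller than in the test case, and the numbers will vary by machine and SciPy version.

```python
import itertools
import time
import numpy as np
import scipy.spatial as ssp

def time_stages(simp, samples):
    """Return (seconds in find_simplex, seconds in the vectorized transform, coordinates)."""
    dim = samples.shape[1]

    t0 = time.perf_counter()
    s = simp.find_simplex(samples)            # point location
    t1 = time.perf_counter()
    b0 = np.einsum('ijk,ik->ij',
                   simp.transform[s, :dim],
                   samples - simp.transform[s, dim])
    coord = np.c_[b0, 1 - b0.sum(axis=1)]     # vectorized barycentric step
    t2 = time.perf_counter()

    return t1 - t0, t2 - t1, coord

N = 20
points = np.array(list(itertools.product(range(N), repeat=2)), dtype=float)
# Scale by N - 1 so that every sample stays inside the grid's convex hull
samples = np.random.default_rng(0).random((10_000, 2)) * (N - 1)

simp = ssp.Delaunay(points)
t_find, t_transform, coord = time_stages(simp, samples)
print(f"find_simplex: {t_find:.4f} s, transform: {t_transform:.4f} s")
```

If the same point set is queried repeatedly, building the `Delaunay` object once and reusing it, as done here, also avoids paying for the triangulation on every call.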