Parallelization of calls to scipy RectBivariateSpline

Asked: 2015-07-28 23:12:10

Tags: python multithreading numpy parallel-processing scipy

I'm working on a python code where I need to evaluate a 2D spline at an arbitrary set of points many times. The code looks like this:

import scipy.interpolate

spline = scipy.interpolate.RectBivariateSpline(...)
for i in range(1000000):
    x_points, y_points = data.get_output_points(i)
    vals = spline.ev(x_points, y_points)
    # ... do stuff with vals ...

There is no overlap of the output points. I would like to parallelize this using threads or some kind of shared memory since data.get_output_points uses a lot of memory. Naively, I tried spawning 10 threads and giving them each 1/10 of that loop. However, this doesn't give me any speed-up over running with a single thread.

I profiled the code, and it is spending all of its time in fitpack2.py:674(__call__), which is the _BivariateSplineBase evaluation function. It seems like I'm running into some GIL issue which is preventing the threads from running independently.

How can I get around the GIL issue and parallelize this? Is there a way to call into the fitpack routines that will parallelize well, or a different spline that I could use? My input grid is uniform and oversampled, but my output points can be anywhere. I have tried RegularGridInterpolator (linear interpolation), whose performance is good enough, although not ideal, but it also parallelizes poorly with threads.

EDIT: Here is what I mean by naive thread parallelization:

import threading

def worker(start, end):
    for i in range(start, end):
        x_points, y_points = data.get_output_points(i)
        vals = spline.ev(x_points, y_points)
        # ... do stuff with vals ...

# Split the 1,000,000 iterations between two threads.
t1 = threading.Thread(target=worker, args=(0, 500000))
t2 = threading.Thread(target=worker, args=(500000, 1000000))
t1.start()
t2.start()
t1.join()
t2.join()

1 Answer:

Answer 0 (score: 1)

There are several ways to do parallel processing in Python that avoid the GIL; see here for details.

Yes, you are running into the GIL: the threads serialize on it during the spline evaluation, which is why you see no speed-up.
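
One common way to sidestep the GIL for a loop like this is to split the index range across worker processes with multiprocessing.Pool. Below is a minimal sketch, not the answerer's code: the grid data, the get_output_points function, and the chunk sizes are hypothetical stand-ins for the objects in the question, and it assumes the spline (and whatever data each worker needs) can be inherited or pickled by the child processes.

import numpy as np
import scipy.interpolate
from multiprocessing import Pool

# Hypothetical stand-in for the real grid: uniform, oversampled input data.
grid_x = np.linspace(0.0, 1.0, 200)
grid_y = np.linspace(0.0, 1.0, 200)
grid_z = np.random.default_rng(0).random((200, 200))
spline = scipy.interpolate.RectBivariateSpline(grid_x, grid_y, grid_z)

def get_output_points(i):
    # Hypothetical stand-in for data.get_output_points(i).
    rng = np.random.default_rng(i)
    return rng.random(100), rng.random(100)

def worker(index_range):
    # Evaluate the spline for one contiguous block of indices.
    start, end = index_range
    partial = 0.0
    for i in range(start, end):
        x_points, y_points = get_output_points(i)
        vals = spline.ev(x_points, y_points)
        partial += vals.sum()  # stand-in for "do stuff with vals"
    return partial

if __name__ == "__main__":
    n_tasks, n_procs = 1000000, 10
    edges = np.linspace(0, n_tasks, n_procs + 1, dtype=int)
    chunks = list(zip(edges[:-1], edges[1:]))
    with Pool(n_procs) as pool:
        # Each process evaluates its own chunk; the GIL in one process
        # does not block the others.
        partials = pool.map(worker, chunks)
    total = sum(partials)

Each process runs its spline evaluations independently, so the GIL no longer serializes them. On Linux the module-level spline is inherited copy-on-write by forked workers; if the per-iteration data is large, passing only indices (as above) and having each worker load or memory-map the shared arrays once helps keep the extra memory cost down.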