I'm working on some Python code where I need to evaluate a 2D spline at an arbitrary set of points many times. The code looks like this:
spline = scipy.interpolate.RectBivariateSpline(...)
for i in range(1000000):
    x_points, y_points = data.get_output_points(i)
    vals = spline.ev(x_points, y_points)
    """ do stuff with vals """
The output points do not overlap between iterations. I would like to parallelize this using threads, or some other shared-memory approach, since data.get_output_points uses a lot of memory. Naively, I tried spawning 10 threads and giving each of them 1/10 of the loop, but this gives no speed-up over running with a single thread.
I profiled the code, and it spends essentially all of its time in fitpack2.py:674(__call__), which is the _BivariateSplineBase evaluation function. It seems I'm hitting a GIL issue that prevents the threads from running independently.
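For reference, here is roughly how I collected that profile (run_evaluation_loop is just a placeholder name for a wrapper around the loop above):

import cProfile
import pstats

# run_evaluation_loop is a placeholder wrapper around the evaluation loop shown above.
cProfile.run("run_evaluation_loop()", "spline_profile.stats")

# Sort by cumulative time; the top entry is fitpack2.py:674(__call__).
stats = pstats.Stats("spline_profile.stats")
stats.sort_stats("cumulative").print_stats(20)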
How can I get around the GIL and parallelize this? Is there a way to call into the fitpack routines that parallelizes well, or a different spline I could use? My input grid is uniform and oversampled, but my output points can be anywhere. I have tried RegularGridInterpolator (linear interpolation), whose performance is good enough although not ideal, but it parallelizes poorly using threads.
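For completeness, this is roughly how I set up the RegularGridInterpolator variant; x_grid, y_grid, and grid_values are placeholder names for my uniform grid axes and the sampled values:

import numpy as np
from scipy.interpolate import RegularGridInterpolator

# x_grid and y_grid are the 1D axes of the uniform, oversampled input grid,
# and grid_values is the 2D array of samples on that grid (placeholder names).
interp = RegularGridInterpolator((x_grid, y_grid), grid_values, method="linear")

# Evaluate at scattered output points, analogous to spline.ev(x_points, y_points):
# coordinates are stacked into an (N, 2) array.
vals = interp(np.column_stack((x_points, y_points)))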
EDIT: Here is what I mean by naive thread parallelization:
import threading

def worker(start, end):
    for i in range(start, end):
        x_points, y_points = data.get_output_points(i)
        vals = spline.ev(x_points, y_points)
        """ do stuff with vals """

t1 = threading.Thread(target=worker, args=(0, 500000))
t2 = threading.Thread(target=worker, args=(500000, 1000000))
t1.start()
t2.start()
t1.join()
t2.join()
Answer 0 (score: 1):