I'm generating a meshgrid with NumPy, and it takes a lot of memory and quite a bit of time.
xi, yi = np.meshgrid(xi, yi)
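The meshgrid I generate has the same resolution as the underlying site map image, which is sometimes 3000 px on a side. It sometimes uses a great deal of memory, and takes 10-15 seconds or more while it is being written to the page file.
My question is: can I speed this up without upgrading the server? Here is a full copy of my application's source code.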
def generateContours(date_collected, substance_name, well_arr, site_id, sitemap_id, image, title_wildcard='', label_over_well=False, crop_contours=False, groundwater_contours=False, flow_lines=False, site_image_alpha=1, status_token=""):
    #create empty arrays to fill up!
    x_values = []
    y_values = []
    z_values = []
    #iterate over wells and fill the arrays with well data
    for well in well_arr:
        x_values.append(well['xpos'])
        y_values.append(well['ypos'])
        z_values.append(well['value'])
    #initialize numpy array as required for interpolation functions
    x = np.array(x_values, dtype=np.float)
    y = np.array(y_values, dtype=np.float)
    z = np.array(z_values, dtype=np.float)
    #create a list of x, y coordinate tuples
    points = zip(x, y)
    #create a grid on which to interpolate data
    start_time = time.time()
    xi, yi = np.linspace(0, image['width'], image['width']), np.linspace(0, image['height'], image['height'])
    xi, yi = np.meshgrid(xi, yi)
    #interpolate the data with the matlab griddata function (http://matplotlib.org/api/mlab_api.html#matplotlib.mlab.griddata)
    zi = griddata(x, y, z, xi, yi, interp='nn')
    #create a matplotlib figure and adjust the width and heights to output contours to a resolution very close to the original sitemap
    fig = plt.figure(figsize=(image['width']/72, image['height']/72))
    #create a single subplot, just takes over the whole figure if only one is specified
    ax = fig.add_subplot(111, frameon=False, xticks=[], yticks=[])
    #read the database image and save to a temporary variable
    im = Image.open(image['tmpfile'])
    #place the sitemap image on top of the figure
    ax.imshow(im, origin='upper', alpha=site_image_alpha)
    #figure out a good linewidth
    if image['width'] > 2000:
        linewidth = 3
    else:
        linewidth = 2
    #create the contours (options here http://cl.ly/2X0c311V2y01)
    kwargs = {}
    if groundwater_contours:
        kwargs['colors'] = 'b'
    CS = plt.contour(xi, yi, zi, linewidths=linewidth, **kwargs)
    for key, value in enumerate(CS.levels):
        if value == 0:
            CS.collections[key].remove()
    #add a streamplot
    if flow_lines:
        dy, dx = np.gradient(zi)
        plt.streamplot(xi, yi, dx, dy, color='c', density=1, arrowsize=3, arrowstyle='<-')
    #add labels to well locations
    label_kwargs = {}
    if label_over_well is True:
        label_kwargs['manual'] = points
    plt.clabel(CS, CS.levels[1::1], inline=5, fontsize=math.floor(image['width']/100), fmt="%.1f", **label_kwargs)
    #add scatterplot to show where well data was read
    scatter_size = math.floor(image['width']/20)
    plt.scatter(x, y, s=scatter_size, c='k', facecolors='none', marker=(5, 1))
    try:
        site_name = db_session.query(Sites).filter_by(site_id=site_id).first().title
    except:
        site_name = "Site Map #%i" % site_id
    sitemap = SiteMaps.query.get(sitemap_id)
    if sitemap.title != 'Sitemap':
        sitemap_wildcard = " - " + sitemap.title
    else:
        sitemap_wildcard = ""
    if title_wildcard != '':
        filename_wildcard = "-" + slugify(title_wildcard)
        title_wildcard = " - " + title_wildcard
    else:
        filename_wildcard = ""
        title_wildcard = ""
    #add descriptive title to the top of the contours
    title_font_size = math.floor(image['width']/72)
    plt.title(parseDate(date_collected) + " - " + site_name + " " + substance_name + " Contour" + sitemap_wildcard + title_wildcard, fontsize=title_font_size)
    #generate a unique filename and save to a temp directory
    filename = slugify(site_name) + str(int(time.time())) + filename_wildcard + ".pdf"
    temp_dir = tempfile.gettempdir()
    tempFileObj = temp_dir + "/" + filename
    savefig(tempFileObj) # bbox_inches='tight' tightens the white border
    #clears the matplotlib memory
    clf()
    #send the temporary file to the user
    resp = make_response(send_file(tempFileObj, mimetype='application/pdf', as_attachment=True, attachment_filename=filename))
    #set the users status token for javascript workaround to check if file is done being generated
    resp.set_cookie('status_token', status_token)
    return resp
Answer 0 (score: 6)
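If meshgrid is what is slowing you down, don't call it. According to the griddata docs:
"xi and yi must describe a regular grid, can be either 1D or 2D, but must be monotonically increasing."
So your call to griddata should work the same if you skip the call to meshgrid and do this: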
xi = np.linspace(0, image['width'], image['width'])
yi = np.linspace(0, image['height'], image['height'])
zi = griddata(x, y, z, xi, yi, interp='nn')
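That said, if your x and y vectors are large, the actual interpolation, i.e. the call to griddata, may take quite a while, because Delaunay triangulation is a computationally intensive operation. Are you sure your performance problem comes from meshgrid and not from griddata?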
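One way to answer that question is to time each step separately. The following is only a rough sketch: the timed helper and the 3000 x 2000 image size are hypothetical, not taken from the question's application. It times the meshgrid call on its own and compares the memory held by the 1-D axes with that of the full 2-D grids.

```python
import time

import numpy as np


def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print("%s: %.3f s" % (label, elapsed))
    return result, elapsed


# Hypothetical image size, standing in for image['width'] and image['height'].
width, height = 3000, 2000
xi = np.linspace(0, width, width)
yi = np.linspace(0, height, height)

# Time the meshgrid call in isolation.
(grid_x, grid_y), meshgrid_seconds = timed("meshgrid", np.meshgrid, xi, yi)

# Memory held by the two 1-D axes versus the two full 2-D grids.
bytes_1d = xi.nbytes + yi.nbytes
bytes_2d = grid_x.nbytes + grid_y.nbytes
print("1-D axes: %d bytes, 2-D grids: %d bytes" % (bytes_1d, bytes_2d))
```

The same wrapper can be placed around the griddata call in the question, for example timed("griddata", griddata, x, y, z, xi, yi, interp='nn'), to see which of the two steps dominates.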
Answer 1 (score: 2)
How about this?
xi, yi = np.meshgrid(xi, yi, copy=False)
That way it only returns views of the original arrays instead of copying all the data.
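To illustrate what "views" means here, this small sketch (the array sizes are arbitrary assumptions) compares the default call with copy=False. With copy=False the 2-D results are broadcast views: one stride is zero, so every row or column points back at the same 1-D data.

```python
import numpy as np

# Arbitrary illustrative sizes, not the question's actual image dimensions.
xi = np.linspace(0, 3000, 3000)
yi = np.linspace(0, 2000, 2000)

full_x, full_y = np.meshgrid(xi, yi)              # default: allocates full copies
view_x, view_y = np.meshgrid(xi, yi, copy=False)  # broadcast views, no new data

# Both variants expose the same shape and the same values.
print(full_x.shape, view_x.shape)
print(np.array_equal(full_x, view_x), np.array_equal(full_y, view_y))

# A zero stride means the same 1-D data is reused along that axis.
print(view_x.strides, view_y.strides)

# The views share memory with the original 1-D arrays; the copies do not.
print(np.shares_memory(view_x, xi), np.shares_memory(full_x, xi))
```

Whether the saving survives depends on what consumes the grids: a function that internally makes contiguous copies of its inputs will materialize the full arrays anyway. The views should also be treated as read-only, since several elements refer to the same memory location.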
Answer 2 (score: 1)
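It looks like you may not need to pass xi and yi through meshgrid at all. Check the docstrings of the functions where you use xi and yi. Many of them accept (or even expect) 1-D arrays.
For example: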
In [33]: x
Out[33]: array([0, 0, 0, 1, 1, 1, 2, 2, 2])
In [34]: y
Out[34]: array([0, 1, 2, 0, 1, 2, 0, 1, 2])
In [35]: z
Out[35]: array([0, 1, 4, 1, 2, 5, 2, 3, 6])
In [36]: xi
Out[36]: array([ 0. , 0.5, 1. , 1.5, 2. ])
In [37]: yi
Out[37]:
array([ 0. , 0.33333333, 0.66666667, 1. , 1.33333333,
1.66666667, 2. ])
In [38]: zi = griddata(x, y, z, xi, yi)
In [39]: zi
Out[39]:
array([[ 0. , 0.5 , 1. , 1.5 , 2. ],
[ 0.33333333, 0.83333333, 1.33333333, 1.83333333, 2.33333333],
[ 0.66666667, 1.16666667, 1.66666667, 2.16666667, 2.66666667],
[ 1. , 1.61111111, 2. , 2.61111111, 3. ],
[ 2. , 2.5 , 3. , 3.5 , 4. ],
[ 3. , 3.5 , 4. , 4.5 , 5. ],
[ 4. , 4.5 , 5. , 5.5 , 6. ]])
In [40]: plt.contour(xi, yi, zi)
Out[40]: <matplotlib.contour.QuadContourSet instance at 0x3ba03b0>