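Imagine an empty 3x4 NumPy array for which you know the coordinates of the top-left corner and the step sizes in the horizontal and vertical directions. I would like to know the coordinates of the center of every cell in the whole array.
To do this I implemented a nested for-loop.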
In [12]:
import numpy as np
# extent = (topleft_x, stepsize_x, 0, topleft_y, 0, stepsize_y); stepsize_y is negative because the origin is the top-left corner
extent = (5530000.0, 5000.0, 0.0, 807000.0, 0.0, -5000.0)
array = np.zeros([3,4],object)
cols = array.shape[0]
rows = array.shape[1]
# function to apply to each cell
def f(x,y):
    return x*extent[1]+extent[0]+extent[1]/2, y*extent[5]+extent[3]+extent[5]/2
# nested for-loop
def nestloop(cols,rows):
    for col in range(cols):
        for row in range(rows):
            array[col,row] = f(col,row)
In [13]:
%timeit nestloop(cols,rows)
100000 loops, best of 3: 17.4 µs per loop
In [14]:
array.T
Out[14]:
array([[(5532500.0, 804500.0), (5537500.0, 804500.0), (5542500.0, 804500.0)],
[(5532500.0, 799500.0), (5537500.0, 799500.0), (5542500.0, 799500.0)],
[(5532500.0, 794500.0), (5537500.0, 794500.0), (5542500.0, 794500.0)],
[(5532500.0, 789500.0), (5537500.0, 789500.0), (5542500.0, 789500.0)]], dtype=object)
But, eager to learn, how could I optimize this? I was thinking of vectorizing it or using a lambda. I tried to vectorize it like this:
array[:,:] = np.vectorize(check)(cols,rows)
ValueError: could not broadcast input array from shape (2) into shape (3,4)
However, I get a broadcasting error. Right now the array is 3 by 4, but it could also become 3000 by 4000.
Answer 0 (score: 3)
Sure, the way you are computing the x and y coordinates is very inefficient because it is not vectorized at all. You can do this instead:
In [1]: import numpy as np
In [2]: extent = (5530000.0, 5000.0, 0.0, 807000.0, 0.0, -5000.0)
...: x_steps = np.array([0,1,2]) * extent[1]
...: y_steps = np.array([0,1,2,3]) * extent[-1]
...:
In [3]: x_coords = extent[0] + x_steps + extent[1]/2
...: y_coords = extent[3] + y_steps + extent[-1]/2
...:
In [4]: x_coords
Out[4]: array([ 5532500., 5537500., 5542500.])
In [5]: y_coords
Out[5]: array([ 804500., 799500., 794500., 789500.])
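The index arrays above are hard-coded for the 3 by 4 case. For larger grids such as the 3000 by 4000 one mentioned in the question, a possible generalization is to build the indices with `np.arange`. This is only a sketch: the helper name `cell_centers` and its parameters `n_x` and `n_y` are my own, not something from the question.

```python
import numpy as np

def cell_centers(extent, n_x, n_y):
    """Return 1-D arrays with the x and y coordinates of the cell centers.

    extent uses the layout from the question:
    (topleft_x, stepsize_x, 0, topleft_y, 0, stepsize_y).
    n_x and n_y are hypothetical parameters: the number of cells per axis.
    """
    # (index + 0.5) * step moves from the cell edge to the cell center
    x_coords = extent[0] + (np.arange(n_x) + 0.5) * extent[1]
    y_coords = extent[3] + (np.arange(n_y) + 0.5) * extent[5]
    return x_coords, y_coords

extent = (5530000.0, 5000.0, 0.0, 807000.0, 0.0, -5000.0)
x_coords, y_coords = cell_centers(extent, 3, 4)
```

With `n_x=3` and `n_y=4` this gives the same two arrays as shown above. Other sizes only require changing the two counts.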
At this point the coordinates of the points are given by the Cartesian product() of these two arrays:
In [5]: import itertools as it
   ...: list(it.product(x_coords, y_coords))
Out[5]: [(5532500.0, 804500.0), (5532500.0, 799500.0), (5532500.0, 794500.0), (5532500.0, 789500.0), (5537500.0, 804500.0), (5537500.0, 799500.0), (5537500.0, 794500.0), (5537500.0, 789500.0), (5542500.0, 804500.0), (5542500.0, 799500.0), (5542500.0, 794500.0), (5542500.0, 789500.0)]
You just have to group them 4 by 4.
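As a minimal sketch of that grouping step in plain Python (the names `pairs`, `n_y` and `grouped` are mine, purely for illustration), the flat list can be sliced into runs of four:

```python
import itertools as it

x_coords = [5532500.0, 5537500.0, 5542500.0]
y_coords = [804500.0, 799500.0, 794500.0, 789500.0]

pairs = list(it.product(x_coords, y_coords))
n_y = len(y_coords)

# product() varies y fastest, so each run of n_y pairs shares one x value
grouped = [pairs[i:i + n_y] for i in range(0, len(pairs), n_y)]
```

Each inner list then holds the four cell centers of one x position, so `grouped[i][j]` corresponds to `(x_coords[i], y_coords[j])`.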
To get the product with numpy you can do the following (based on this answer):
In [6]: np.transpose([np.tile(x_coords, len(y_coords)), np.repeat(y_coords, len(x_coords))])
Out[6]:
array([[ 5532500., 804500.],
[ 5537500., 804500.],
[ 5542500., 804500.],
[ 5532500., 799500.],
[ 5537500., 799500.],
[ 5542500., 799500.],
[ 5532500., 794500.],
[ 5537500., 794500.],
[ 5542500., 794500.],
[ 5532500., 789500.],
[ 5537500., 789500.],
[ 5542500., 789500.]])
Which can be reshaped:
In [8]: product.reshape((3,4,2)) # product is the result of the above
Out[8]:
array([[[ 5532500., 804500.],
[ 5537500., 804500.],
[ 5542500., 804500.],
[ 5532500., 799500.]],
[[ 5537500., 799500.],
[ 5542500., 799500.],
[ 5532500., 794500.],
[ 5537500., 794500.]],
[[ 5542500., 794500.],
[ 5532500., 789500.],
[ 5537500., 789500.],
[ 5542500., 789500.]]])
If this is not the order you want, you can do something like this:
In [9]: ar = np.zeros((3,4,2), float)
...: ar[0] = product[::3]
...: ar[1] = product[1::3]
...: ar[2] = product[2::3]
...:
In [10]: ar
Out[10]:
array([[[ 5532500., 804500.],
[ 5532500., 799500.],
[ 5532500., 794500.],
[ 5532500., 789500.]],
[[ 5537500., 804500.],
[ 5537500., 799500.],
[ 5537500., 794500.],
[ 5537500., 789500.]],
[[ 5542500., 804500.],
[ 5542500., 799500.],
[ 5542500., 794500.],
[ 5542500., 789500.]]])
I am sure there is a better way to do this last reshaping, but I am not a numpy expert.
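One possible way to avoid the manual slicing, offered as a sketch rather than the definitive approach, is `np.meshgrid` with `indexing="ij"` followed by `np.stack`. Alternatively, the `product` array can be reshaped to `(4, 3, 2)` and its first two axes swapped. The names `grid` and `same_grid` are mine.

```python
import numpy as np

x_coords = np.array([5532500.0, 5537500.0, 5542500.0])
y_coords = np.array([804500.0, 799500.0, 794500.0, 789500.0])

# indexing="ij" makes both grids shape (len(x_coords), len(y_coords)) = (3, 4)
xx, yy = np.meshgrid(x_coords, y_coords, indexing="ij")

# Stack along a new last axis so that grid[i, j] == (x_coords[i], y_coords[j])
grid = np.stack([xx, yy], axis=-1)

# Alternative starting from the tile/repeat product used earlier:
# its rows are ordered y-major, so reshape to (4, 3, 2) and swap the axes
product = np.transpose([np.tile(x_coords, len(y_coords)),
                        np.repeat(y_coords, len(x_coords))])
same_grid = product.reshape(len(y_coords), len(x_coords), 2).transpose(1, 0, 2)
```

Both `grid` and `same_grid` have shape `(3, 4, 2)` and the same ordering as the `ar` array built by hand, without hard-coding the number of slices.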
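Note that using object as the dtype incurs a huge performance penalty, because numpy cannot optimize anything (it is sometimes slower than using a plain list). I used a (3,4,2) array instead, which allows much faster operations.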
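To check the dtype effect on your own machine, here is a rough sketch that shifts every coordinate once on a float array and once on an object array of tuples. The 300 by 400 grid size, the shift values, and the function names `shift_float` and `shift_object` are arbitrary choices of mine. Timings vary by machine, so none are quoted here.

```python
import timeit

import numpy as np

x_coords = 5530000.0 + (np.arange(300) + 0.5) * 5000.0
y_coords = 807000.0 + (np.arange(400) + 0.5) * -5000.0
xx, yy = np.meshgrid(x_coords, y_coords, indexing="ij")

# Float layout: shape (300, 400, 2), dtype float64
float_grid = np.stack([xx, yy], axis=-1)

# Object layout: one Python tuple per cell, as in the question
object_grid = np.empty((300, 400), dtype=object)
for i in range(300):
    for j in range(400):
        object_grid[i, j] = (x_coords[i], y_coords[j])

def shift_float():
    # A single vectorized addition, broadcast over the last axis
    return float_grid + np.array([100.0, -100.0])

def shift_object():
    # Tuples offer no elementwise arithmetic, so each cell is handled in Python
    out = np.empty(object_grid.shape, dtype=object)
    for idx, (x, y) in np.ndenumerate(object_grid):
        out[idx] = (x + 100.0, y - 100.0)
    return out

print("float :", timeit.timeit(shift_float, number=3))
print("object:", timeit.timeit(shift_object, number=3))
```

Both functions produce the same coordinates. Only the storage layout and the amount of work done in the Python interpreter differ.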