Question

我有一个包含3列的文件，其中前两个是坐标（x，y），第三个是与该位置对应的值（z）。这是一个简短的例子：

我想根据文件中的x，y坐标从第三行创建一个二维值数组。我在每列中读取一个单独的数组，并使用numpy.meshgrid创建了x值和y值的网格，如下所示：

x = [[0 1 2]    and   y = [[0 0 0]
     [0 1 2]               [1 1 1]
     [0 1 2]]              [2 2 2]]

但我是Python的新手，并且不知道如何生成第三个z值网格，如下所示：

z = [[Nan 15 Nan]
     [14  16  18]
     [17  Nan 13]]

用Nan替换0也没关系;我的主要问题是首先创建2D数组。在此先感谢您的帮助！

Answer 1

假设文件中的x和y值直接对应于索引（就像在示例中那样），您可以执行与此类似的操作：

import numpy as np

x = [0, 0, 1, 1, 2, 2]
y = [1, 2, 0, 1, 1, 2]
z = [14, 17, 15, 16, 18, 13]

z_array = np.nan * np.empty((3,3))
z_array[y, x] = z

print z_array

哪个收益率：

[[ nan  15.  nan]
 [ 14.  16.  18.]
 [ 17.  nan  13.]]

对于大型数组，这将比坐标上的显式循环快得多。

处理不均匀的x＆amp;输入

如果你经常采样x＆amp; y点，然后您可以通过减去网格的“角”（即x0和y0），除以单元格间距，并将其转换为整数，将它们转换为网格索引。然后，您可以使用上述方法或任何其他答案。

作为一般例子：

i = ((y - y0) / dy).astype(int)
j = ((x - x0) / dx).astype(int)

grid[i,j] = z

但是，如果您的数据没有规则间隔，可以使用一些技巧。

假设我们有以下数据：

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1977)
x, y, z = np.random.random((3, 10))

fig, ax = plt.subplots()
scat = ax.scatter(x, y, c=z, s=200)
fig.colorbar(scat)
ax.margins(0.05)

enter image description here

我们想要放入常规的10x10网格：

enter image description here

我们实际上可以使用/滥用np.histogram2d。而不是计数，我们将它添加落入单元格的每个点的值。最简单的方法是指定weights=z, normed=False。

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1977)
x, y, z = np.random.random((3, 10))

# Bin the data onto a 10x10 grid
# Have to reverse x & y due to row-first indexing
zi, yi, xi = np.histogram2d(y, x, bins=(10,10), weights=z, normed=False)
zi = np.ma.masked_equal(zi, 0)

fig, ax = plt.subplots()
ax.pcolormesh(xi, yi, zi, edgecolors='black')
scat = ax.scatter(x, y, c=z, s=200)
fig.colorbar(scat)
ax.margins(0.05)

plt.show()

enter image description here

但是，如果我们有大量的积分，一些垃圾箱会有不止一个积分。 weights的{{1}}参数只需添加值。在这种情况下，这可能不是你想要的。尽管如此，我们可以通过除以计数得到每个单元格中落点的平均值。

所以，例如，假设我们有50分：

np.histogram

enter image description here

如果点数非常多，这个精确的方法会变慢（并且可以很容易地加速），但对于小于1e6点的任何东西都足够了。

Answer 2

您可以尝试以下方式：

import numpy as np

x = [0, 0, 1, 1, 2, 2]
y = [1, 2, 0, 1, 1, 2]
z = [14, 17, 15, 16, 18, 13]

arr = np.zeros((3,3))
yx = zip(y,x)

for i, coord in enumerate(yx):
    arr[coord] = z[i]

print arr
>>> [[  0.  15.   0.]
     [ 14.  16.  18.]
     [ 17.   0.  13.]]

Answer 3

Kezzos打败了我，但我有类似的方法，

x = np.array([0,0,1,1,2,2])
y = np.array([1,2,0,1,1,2])
z = np.array([14,17,15,16,18,13])
Z = np.zeros((3,3))
for i,j in enumerate(zip(x,y)): 
    Z[j] = z[i]

Z[np.where(Z==0)] = np.nan

Answer 4

如果您安装了scipy，则可以利用其sparse矩阵模块。使用genfromtxt从文本文件中获取值，并将这些“列”直接插入sparse矩阵创建器。

In [545]: txt=b"""x y z
0 1 14
0 2 17
1 0 15
1 1 16
2 1 18
2 2 13
"""

In [546]: xyz=np.genfromtxt(txt.splitlines(),names=True,dtype=int)

In [547]: sparse.coo_matrix((xyz['z'],(xyz['y'],xyz['x']))).A     
Out[547]: 
array([[ 0, 15,  0],
       [14, 16, 18],
       [17,  0, 13]])

但乔的z_array=np.zeros((3,3),int); z_array[xyz['y'],xyz['x']]=xyz['z']要快得多。

Answer 5

其他人的好答案。认为这对于可能需要此功能的其他人来说可能是一个有用的片段。

def make_grid(x, y, z):
    '''
    Takes x, y, z values as lists and returns a 2D numpy array
    '''
    dx = abs(np.sort(list(set(x)))[1] - np.sort(list(set(x)))[0])
    dy = abs(np.sort(list(set(y)))[1] - np.sort(list(set(y)))[0])
    i = ((x - min(x)) / dx).astype(int) # Longitudes
    j = ((y - max(y)) / dy).astype(int) # Latitudes
    grid = np.nan * np.empty((len(set(j)),len(set(i))))
    grid[-j, i] = z # if using latitude and longitude (for WGS/West)
    return grid

Python numpy：根据坐标

5 个答案:

处理不均匀的x＆amp;输入