Question

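I have data points that represent coordinates for a 2D array (matrix). The points are regularly gridded, except that data points are missing from some grid positions.

For example, consider some XYZ data that fits on a regular 0.1 grid with shape (3, 4). There are gaps and missing points, so there are 5 points rather than 12: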

import numpy as np
X = np.array([0.4, 0.5, 0.4, 0.4, 0.7])
Y = np.array([1.0, 1.0, 1.1, 1.2, 1.2])
Z = np.array([3.3, 2.5, 3.6, 3.8, 1.8])
# Evaluate the regular grid dimension values
# linspace needs an integer number of samples, so cast the rounded count to int
Xr = np.linspace(X.min(), X.max(), int(np.round((X.max() - X.min()) / np.diff(np.unique(X)).min())) + 1)
Yr = np.linspace(Y.min(), Y.max(), int(np.round((Y.max() - Y.min()) / np.diff(np.unique(Y)).min())) + 1)
print('Xr={0}; Yr={1}'.format(Xr, Yr))
# Xr=[ 0.4  0.5  0.6  0.7]; Yr=[ 1.   1.1  1.2]

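What I'd like to see is something like this image (background: black = base-0 index; grey = coordinate value; colour = matrix value; white = missing).

This is what I have, which is intuitive with a for loop: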

ar = np.ma.array(np.zeros((len(Yr), len(Xr)), dtype=Z.dtype), mask=True)
for x, y, z in zip(X, Y, Z):
    j = (np.abs(Xr -  x)).argmin()
    i = (np.abs(Yr -  y)).argmin()
    ar[i, j] = z
print(ar)
# [[3.3 2.5 -- --]
#  [3.6 -- -- --]
#  [3.8 -- -- 1.8]]

Is there a more NumPythonic way to vectorize this approach and return the 2D array ar? Or is the for loop necessary?

Answer 1

You can do this in one line with np.histogram2d:

data = np.histogram2d(Y, X, bins=[len(Yr),len(Xr)], weights=Z)
print(data[0])
# [[ 3.3  2.5  0.   0. ]
#  [ 3.6  0.   0.   0. ]
#  [ 3.8  0.   0.   1.8]]
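Note that np.histogram2d fills empty cells with 0, whereas the loop in the question masks them. Here is a minimal sketch of one way to recover the mask, assuming each grid cell receives at most one point (otherwise the weights are summed): run a second, unweighted histogram to count the points per cell and mask the cells whose count is zero. The names shape, sums and counts are illustrative and not part of the answer.

```python
import numpy as np

X = np.array([0.4, 0.5, 0.4, 0.4, 0.7])
Y = np.array([1.0, 1.0, 1.1, 1.2, 1.2])
Z = np.array([3.3, 2.5, 3.6, 3.8, 1.8])

shape = (3, 4)  # (len(Yr), len(Xr)) from the question

# Weighted histogram: each cell holds the sum of the Z values that fall in it
sums, _, _ = np.histogram2d(Y, X, bins=shape, weights=Z)
# Unweighted histogram: each cell holds the number of points that fall in it
counts, _, _ = np.histogram2d(Y, X, bins=shape)

# Mask every cell that received no data point
ar = np.ma.array(sums, mask=(counts == 0))
print(ar)
```

The second histogram is what distinguishes a genuine Z value of 0 from a cell with no data.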

Answer 2

You can use X and Y to create X-Y coordinates on a 0.1-spaced grid extending from the min to max of X and the min to max of Y, and then insert the Z's into those specific positions. This avoids using linspace to get Xr and Yr, so it should be quite efficient. Here's the implementation:

def indexing_based(X, Y, Z):
    # Convert X's and Y's to indices on a 0.1 spaced grid
    X_int = np.round((X*10)).astype(int)
    Y_int = np.round((Y*10)).astype(int)
    X_idx = X_int - X_int.min()
    Y_idx = Y_int - Y_int.min()

    # Setup output array and index it with X_idx & Y_idx to set those as Z
    out = np.zeros((Y_idx.max()+1, X_idx.max()+1))
    out[Y_idx, X_idx] = Z
    return out

Runtime tests:

This section compares the indexing-based approach against the other np.histogram2d-based solution to assess performance.
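The same indexing idea can be sketched without hard-coding the 0.1 spacing, and with missing cells masked rather than set to zero, which is closer to the output the question asks for. This is only a sketch: it assumes, as the question's own linspace code does, that the smallest gap between distinct coordinate values equals the grid spacing, and grid_from_points is a hypothetical helper name.

```python
import numpy as np

def grid_from_points(X, Y, Z):
    """Scatter Z onto a regular grid; cells with no point stay masked."""
    # Assumption: the smallest gap between distinct values is the grid step
    dx = np.diff(np.unique(X)).min()
    dy = np.diff(np.unique(Y)).min()

    # Round (not truncate) so float noise cannot shift an index by one
    j = np.rint((X - X.min()) / dx).astype(int)
    i = np.rint((Y - Y.min()) / dy).astype(int)

    shape = (i.max() + 1, j.max() + 1)
    data = np.zeros(shape, dtype=Z.dtype)
    mask = np.ones(shape, dtype=bool)   # start with everything masked
    data[i, j] = Z                      # vectorized scatter of the values
    mask[i, j] = False                  # unmask only the cells that got data
    return np.ma.array(data, mask=mask)

X = np.array([0.4, 0.5, 0.4, 0.4, 0.7])
Y = np.array([1.0, 1.0, 1.1, 1.2, 1.2])
Z = np.array([3.3, 2.5, 3.6, 3.8, 1.8])
ar = grid_from_points(X, Y, Z)
print(ar)
```

If two points map to the same cell, the fancy-index assignment keeps only one of them, so this is suited to data with at most one point per grid position.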

Answer 3

You can use scipy's coo_matrix. It lets you construct a sparse matrix from coordinates and data. See the examples at the link below.

http://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.sparse.coo_matrix.html

Hope that helps.

Answer 4

A sparse matrix was the first solution that came to mind, but since X and Y are floats, it's a bit messy:

In [624]: I=((X-.4)*10).round().astype(int)
In [625]: J=((Y-1)*10).round().astype(int)
In [626]: I,J
Out[626]: (array([0, 1, 0, 0, 3]), array([0, 0, 1, 2, 2]))

In [627]: sparse.coo_matrix((Z,(J,I))).toarray()
Out[627]: 
array([[ 3.3,  2.5,  0. ,  0. ],
       [ 3.6,  0. ,  0. ,  0. ],
       [ 3.8,  0. ,  0. ,  1.8]])

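It still needs some way of matching those coordinates to [0,1,2...] indices. My quick cheat was to just scale the values linearly. Even so, I had to take care when converting the floats to ints.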
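To illustrate why care is needed with the float-to-int conversion, here is a small sketch using the X values from the question and the same linear scaling. The variable names scaled, truncated and rounded are illustrative only. Because 0.4, 0.5 and 0.7 are not exactly representable in binary floating point, the scaled values can fall slightly below the intended integer, and plain truncation then gives an index that is one too low.

```python
import numpy as np

X = np.array([0.4, 0.5, 0.4, 0.4, 0.7])

# Linear scaling onto the index range, as in the session above
scaled = (X - 0.4) * 10

# astype(int) truncates toward zero, so a value just under 3 becomes 2
truncated = scaled.astype(int)

# Rounding to the nearest integer first gives the intended indices
rounded = np.rint(scaled).astype(int)

print(scaled)
print(truncated)
print(rounded)  # [0 1 0 0 3]
```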

sparse.coo_matrix works because a natural way of defining a sparse matrix is with (i, j, data) tuples, which of course can be translated to I, J, Data lists or arrays.
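As a rough sketch of that translation, the snippet below starts from a list of (i, j, data) tuples, with indices already computed for the question's five points, unzips them into three arrays and passes them to coo_matrix. The names triplets, rows, cols and data are illustrative, and the explicit shape argument is an assumption added here so the matrix size does not depend on the largest index present.

```python
import numpy as np
from scipy import sparse

# (row, column, value) triplets for the five points in the question
triplets = [(0, 0, 3.3), (0, 1, 2.5), (1, 0, 3.6), (2, 0, 3.8), (2, 3, 1.8)]

# Unzip the tuples into three parallel arrays
rows, cols, data = (np.array(v) for v in zip(*triplets))

# Build the sparse matrix in COO format and convert it to a dense array
m = sparse.coo_matrix((data, (rows, cols)), shape=(3, 4))
dense = m.toarray()
print(dense)
```

As with the histogram approach, unset cells come out as 0 rather than masked, and duplicate (i, j) entries are summed when the matrix is converted.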

I rather like the histogram solution, even though I haven't had occasion to use it.

Create 2D NumPy array from coordinates
