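I have an object that occupies an underlying N-dimensional square grid (represented by a numpy array), such that only 25% of the grid points are occupied. Every 1x1x1x...N-dimensional cube (i.e. hypercube) in this grid contains the same amount of the object, located at only some of the hypercube's vertices. I have an array of the coordinates of all occupied grid points. The task is to loop over this array, extract the occupied coordinates belonging to each 1x1x1... hypercube, and store them in a new array for further processing.
The situation is best explained with an example. Consider the 3D case, where the underlying grid is chosen such that 1<=i,j,k<=4.
This gives the 2D numpy array:
A = [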
[1 1 1]
[1 2 1]
[1 3 1]
[1 4 1]
[2 1 1]
[2 2 1]
[2 3 1]
[2 4 1]
[3 1 1]
[3 2 1]
[3 3 1]
[3 4 1]
[4 1 1]
[4 2 1]
[4 3 1]
[4 4 1]
[1 1 2]
[1 2 2]
[1 3 2]
[1 4 2]
[2 1 2]
[2 2 2]
[2 3 2]
[2 4 2]
[3 1 2]
[3 2 2]
[3 3 2]
[3 4 2]
[4 1 2]
[4 2 2]
[4 3 2]
[4 4 2]
[1 1 3]
[1 2 3]
[1 3 3]
[1 4 3]
[2 1 3]
[2 2 3]
[2 3 3]
[2 4 3]
[3 1 3]
[3 2 3]
[3 3 3]
[3 4 3]
[4 1 3]
[4 2 3]
[4 3 3]
[4 4 3]
[1 1 4]
[1 2 4]
[1 3 4]
[1 4 4]
[2 1 4]
[2 2 4]
[2 3 4]
[2 4 4]
[3 1 4]
[3 2 4]
[3 3 4]
[3 4 4]
[4 1 4]
[4 2 4]
[4 3 4]
[4 4 4]
]
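In this setting, an example of the 2D numpy array that I need to process is:
B = [[1,1,1],[1,2,1],[1,3,1],[1,4,1],[2,2,1],[2,3,1],[2,4,1],[3,2,1],[3,3,1],[3,4,1],[4,1,1],[4,3,1],[1,1,2],[1,4,2],[2,1,2],[2,2,2],[2,4,2],[3,1,2],[3,2,2],[3,4,2],[4,1,2],[4,2,2],[4,3,2],[4,4,2],[1,1,3],[1,4,3],[2,1,3],[2,2,3],[2,3,3],[2,4,3],[3,1,3],[3,2,3],[3,3,3],[4,1,3],[4,2,3],[4,3,3],[4,4,3],[1,2,4],[1,3,4],[1,4,4],[2,1,4],[2,2,4],[2,3,4],[3,1,4],[3,2,4],[3,3,4],[4,3,4],[4,4,4]]
B is simply the array of occupied grid points. Looping over B, I want to obtain the occupied coordinates belonging to each cube of the underlying grid. The cube edges are defined by 1<=i,j,k<=2, then (3<=i<=4) and (1<=j,k<=2), and so on. For example, for the cube bounded by 1<=i,j,k<=2, I want to extract the array [[1,1,1],[1,1,2],[1,2,1],[2,1,2],[2,2,1],[2,2,2]] from B and store it in a new array for further processing. This is then repeated until all 8 cubes in this example have been handled. Note that the grid is always chosen to be even, so that this kind of partition is always possible. Although I can choose the grid to be even, I have no control over the site occupancy of the object.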
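To make the target concrete, here is a minimal sketch of what the extraction for one cube amounts to. It is an illustration only, not a general solution: `B_part` is a hypothetical handful of rows taken from B, and `lo`/`hi` are assumed names for the cube's lower and upper corners.

```python
import numpy as np

# Hypothetical subset of B, kept short for readability
B_part = np.array([[1, 1, 1], [1, 2, 1], [1, 3, 1], [2, 2, 1], [4, 1, 1],
                   [1, 1, 2], [2, 1, 2], [2, 2, 2], [1, 4, 2], [2, 1, 3]])

# Corners of the cube bounded by 1<=i,j,k<=2 (assumed names)
lo = np.array([1, 1, 1])
hi = np.array([2, 2, 2])

# A row belongs to the cube if every coordinate lies within [lo, hi]
mask = ((B_part >= lo) & (B_part <= hi)).all(axis=1)
cube = B_part[mask]
print(cube)
```

Repeating this for every cube works, but it rescans the whole array once per cube.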
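Numpy array slicing does not work because, in general, the grid points in B are not contiguous. I tried code using "for" loops to impose the ranges of the cube edges, but it became too complicated and did not seem like the right approach. Then there is the "numpy.where" function, but the complexity of the situation makes it hard to apply. "numpy.extract" faces similar challenges.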
Answer 0 (score: 0)
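If you have the grid as an (ndim+1)-dimensional array of coordinates, like so:
import numpy as np

a = np.stack(np.mgrid[1:5, 1:5], -1)
then it is just a matter of judicious reshaping and transposing: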
import itertools as it
# dimensions for reshape:
# split all coordinate axes into groups of two, leave the
# "dimension subscript" axis alone
chop = *it.chain.from_iterable((i//2, 2) for i in a.shape[:-1]),
# shuffle for transpose:
# regroup axes into coordinates then pairs then subscript
ax_shuf = *(1 ^ np.r_[1:4*a.ndim-3:2] % (2*a.ndim-1)), -1
# put it together and flatten, i.e. join coordinates and join pairs
a.reshape(*chop, -1).transpose(*ax_shuf).reshape(np.prod(chop[::2]), -1, a.shape[-1])
Result:
array([[[1, 1],
[1, 2],
[2, 1],
[2, 2]],
[[1, 3],
[1, 4],
[2, 3],
[2, 4]],
[[3, 1],
[3, 2],
[4, 1],
[4, 2]],
[[3, 3],
[3, 4],
[4, 3],
[4, 4]]])
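The generic `chop` and `ax_shuf` expressions can be hard to read, so here is a sketch of the same reshape and transpose written out by hand for the 4x4 2D grid only. The axis labels in the comments (`bi`, `pi`, `bj`, `pj`) and the name `blocks` are mine, added for illustration.

```python
import numpy as np

a = np.stack(np.mgrid[1:5, 1:5], -1)          # shape (4, 4, 2)

# Split each coordinate axis of length 4 into (block index, position in block).
# Axes are now (bi, pi, bj, pj, coord).
blocks = a.reshape(2, 2, 2, 2, 2)

# Move both block indices to the front: (bi, bj, pi, pj, coord).
blocks = blocks.transpose(0, 2, 1, 3, 4)

# Merge (bi, bj) into one block axis and (pi, pj) into one point axis.
blocks = blocks.reshape(4, 4, 2)
print(blocks[1])
```

For this grid, `chop` evaluates to `(2, 2, 2, 2)` and `ax_shuf` to `(0, 2, 1, 3, -1)`, so the hand-written steps match the general code above.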
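Update
If B is an unordered subset of the full grid, you can do the following:
Note: if needed, you can make this somewhat faster by replacing the argsort in the code below with one of the solutions from "Most efficient way to sort an array into bins specified by an index array?".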
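B = np.array(B)
# label each point by the index of the 2x2x...x2 cube it falls in
unq, idx, cts = np.unique((B-B.min(0))//2, axis=0, return_inverse=True, return_counts=True)
if (cts[0]==cts).all():
    # every cube holds the same number of points: return one regular array
    result = B[idx.argsort()].reshape(len(unq), cts[0], -1)
else:
    # otherwise return a list of arrays, one per cube
    result = np.split(B[idx.argsort()], cts[:-1].cumsum())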
Result:
array([[[1, 1, 1],
[1, 2, 1],
[2, 2, 1],
[2, 2, 2],
[2, 1, 2],
[1, 1, 2]],
[[2, 2, 3],
[2, 1, 3],
[1, 1, 3],
[2, 1, 4],
[2, 2, 4],
[1, 2, 4]],
[[1, 4, 2],
[2, 4, 2],
[1, 3, 1],
[1, 4, 1],
[2, 3, 1],
[2, 4, 1]],
[[1, 4, 4],
[1, 3, 4],
[2, 4, 3],
[2, 3, 3],
[1, 4, 3],
[2, 3, 4]],
[[4, 1, 1],
[4, 1, 2],
[3, 2, 1],
[3, 2, 2],
[4, 2, 2],
[3, 1, 2]],
[[4, 2, 3],
[3, 2, 4],
[3, 1, 3],
[3, 2, 3],
[3, 1, 4],
[4, 1, 3]],
[[4, 4, 2],
[4, 3, 2],
[3, 4, 2],
[4, 3, 1],
[3, 4, 1],
[3, 3, 1]],
[[4, 3, 3],
[3, 3, 3],
[4, 3, 4],
[3, 3, 4],
[4, 4, 3],
[4, 4, 4]]])
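The rows inside each cube above come out in an arbitrary order because the default argsort is not stable. As a variation on the same idea, the sketch below uses a stable sort so that rows keep their input order within each cube, and returns a dict keyed by cube index. The function name `group_by_cube`, its `size` parameter, and the sample `pts` are hypothetical, introduced only for this example; like the code above, it assumes the smallest coordinate along each axis marks the start of the first cube.

```python
import numpy as np

def group_by_cube(points, size=2):
    """Group coordinate rows by the size^N cube they fall in (illustrative helper)."""
    points = np.asarray(points)
    # Integer cube index of every point along each axis
    keys = (points - points.min(axis=0)) // size
    unq, inv = np.unique(keys, axis=0, return_inverse=True)
    inv = inv.ravel()  # defensive: flatten in case the inverse is not 1-D
    # Stable sort keeps the original row order within each cube
    order = np.argsort(inv, kind="stable")
    counts = np.bincount(inv, minlength=len(unq))
    groups = np.split(points[order], counts.cumsum()[:-1])
    return {tuple(int(v) for v in k): g for k, g in zip(unq, groups)}

# Hypothetical unordered sample of occupied points
pts = [[1, 1, 1], [3, 2, 1], [2, 2, 1], [4, 1, 1],
       [1, 1, 2], [2, 1, 2], [3, 1, 2], [1, 4, 2]]
cubes = group_by_cube(pts)
for key, rows in cubes.items():
    print(key, rows.tolist())
```

A dict is convenient when cubes hold different numbers of points; when all counts are equal, the stacked array from the code above is more compact.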