NumPy: arranging and subsampling 3D points on a coordinate grid

Date: 2019-03-29 01:46:29

Tags: python arrays numpy linear-algebra data-science

I have a list of 3D points, for example:

np.array([
    [220, 114, 2000],
    [125.24, 214, 2519],
    ...
    [54.1, 254, 1249]
])

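The points are in no meaningful order. I want to sort and reshape the array so that it better represents a coordinate grid (so that the width and height are known and Z values can be retrieved by index). I also want to downsample the points to whole-integer coordinates to handle collisions, applying min, max, or mean during the downsampling.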

I know I can downsample a 1D array using np.mean and np.shape.
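For reference, the 1D case mentioned above can be sketched as follows. This assumes the usual reshape-then-reduce idiom is what is meant; the helper name downsample_1d and its parameters are hypothetical and not from the question.

```python
import numpy as np

def downsample_1d(a, factor, reduce=np.mean):
    """Reduce every `factor` consecutive samples to one value (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    n = (a.size // factor) * factor           # drop the trailing remainder, if any
    # each row of the reshaped view holds one block of `factor` samples
    return reduce(a[:n].reshape(-1, factor), axis=1)

values = np.arange(10)
means = downsample_1d(values, 3)              # block means of [0,1,2], [3,4,5], [6,7,8]
maxes = downsample_1d(values, 3, np.max)      # block maxima of the same blocks
```

This only works when the samples are already ordered and evenly spaced, which is exactly what the unordered 3D points lack.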

My current approach is to find the minimum and maximum in X and Y, and then put the Z values into a 2D array while doing the downsampling manually.
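A loop-based baseline of that kind might look like the sketch below. It is only a guess at the manual approach described, not the asker's actual code: the function name naive_grid_min is hypothetical, rounding uses np.rint, only the min reduction is shown, and the fourth sample point is made up to force a collision.

```python
import numpy as np

def naive_grid_min(points):
    """Round X and Y to integers and keep the minimum Z per cell (hypothetical baseline)."""
    xy = np.rint(points[:, :2]).astype(int)
    x0, y0 = xy.min(axis=0)
    x1, y1 = xy.max(axis=0)
    # rows follow Y, columns follow X; cells that receive no point stay NaN
    grid = np.full((y1 - y0 + 1, x1 - x0 + 1), np.nan)
    for (x, y), z in zip(xy, points[:, 2]):
        i, j = y - y0, x - x0
        if np.isnan(grid[i, j]) or z < grid[i, j]:
            grid[i, j] = z
    return grid

points = np.array([
    [220, 114, 2000],
    [125.24, 214, 2519],
    [54.1, 254, 1249],
    [54.3, 254.2, 1100],   # made-up point that collides with the previous one
])
grid = naive_grid_min(points)
```

The Python-level loop over every point is what makes this slow on large arrays.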

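This iterates over the giant array several times. I am wondering whether there is a way to do this with np.meshgrid or some other NumPy functionality that I have overlooked.

Thanks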

1 answer:

Answer 0 (score: 0)

You can use the binning approach from "Most efficient way to sort an array into bins specified by an index array?". To get an index array from the y, x coordinates, you can use np.searchsorted and np.ravel_multi_index.
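As a small illustration of that indexing step, the sketch below maps four made-up points onto a 2 x 4 grid. The sample coordinates and grid size are assumptions chosen for the example; the calls are the standard np.linspace, np.searchsorted and np.ravel_multi_index.

```python
import numpy as np

# made-up coordinates for four points
y = np.array([0.0, 2.5, 9.9, 5.0])
x = np.array([1.0, 1.0, 7.5, 4.0])
Ny, Nx = 2, 4

# lower bin edges; endpoint=False keeps the maximum inside the last bin
yg = np.linspace(y.min(), y.max(), Ny, endpoint=False)
xg = np.linspace(x.min(), x.max(), Nx, endpoint=False)

# side='right' minus one gives the bin whose lower edge is <= the coordinate
yidx = np.searchsorted(yg, y, side='right') - 1
xidx = np.searchsorted(xg, x, side='right') - 1

# combine (row, column) into a single flat bin id in row-major order
flat = np.ravel_multi_index((yidx, xidx), (Ny, Nx))
```

Points that share a flat bin id (here the first two) are the collisions that the reduction step has to merge.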

Here is a sample implementation; the stb module is the code from the linked post.

import numpy as np
from stb import sort_to_bins_sparse as sort_to_bins

def grid1D(u, N):
    mn, mx = u.min(), u.max()
    return np.linspace(mn, mx, N, endpoint=False)

def gridify(yxz, N):
    try:
        Ny, Nx = N
    except TypeError:
        Ny = Nx = N
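    y, x, z = yxz.T
    # lower bin edges along each axis
    yg, xg = grid1D(y, Ny), grid1D(x, Nx)
    # index of the bin whose lower edge is the largest one <= the coordinate
    yidx, xidx = yg.searchsorted(y, 'right')-1, xg.searchsorted(x, 'right')-1
    # flatten the (row, column) bin indices into a single bin id
    yx = np.ravel_multi_index((yidx, xidx), (Ny, Nx))
    # group the z values by bin id
    zs = sort_to_bins(yx, z)
    # cumulative counts give the start offset of each bin in zs
    return np.concatenate([[0], np.bincount(yx).cumsum()]), zs, yg, xg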

def bin(yxz, N, binning_method='min'):
    boundaries, binned, yg, xg = gridify(yxz, N)
    result = np.full((yg.size, xg.size), np.nan)
    if binning_method == 'min':
        result.reshape(-1)[:len(boundaries)-1] = np.minimum.reduceat(binned, boundaries[:-1])
    elif binning_method == 'max':
        result.reshape(-1)[:len(boundaries)-1] = np.maximum.reduceat(binned, boundaries[:-1])
    elif binning_method == 'mean':
        result.reshape(-1)[:len(boundaries)-1] = np.add.reduceat(binned, boundaries[:-1]) / np.diff(boundaries)
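    else:
        raise ValueError(f"unknown binning_method: {binning_method!r}")
    # reduceat returns a single element for empty bins, so reset those to NaN
    result.reshape(-1)[np.where(boundaries[1:] == boundaries[:-1])] = np.nan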
    return result

def test():
    yxz = np.random.uniform(0, 100, (100000, 3))
    N = 20
    boundaries, binned, yg, xg = gridify(yxz, N)
    binmin = bin(yxz, N)
    binmean = bin(yxz, N, 'mean')
    y, x, z = yxz.T
    for i in range(N-1):
        for j in range(N-1):
            msk = (y>=yg[i]) & (y<yg[i+1]) & (x>=xg[j]) & (x<xg[j+1])
            assert (z[msk].min() == binmin[i, j]) if msk.any() else np.isnan(binmin[i, j])
            assert np.isclose(z[msk].mean(), binmean[i, j]) if msk.any() else np.isnan(binmean[i, j])
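The stb module is not reproduced here. For experimenting without it, a stable argsort can serve as a stand-in. The sketch below is an assumption, not the code from the linked post: sort_to_bins_argsort is a hypothetical name, it only mirrors the (index, data) call used above, and it is likely slower than the linked implementation. It also shows on a tiny made-up input how the boundaries feed np.minimum.reduceat.

```python
import numpy as np

def sort_to_bins_argsort(idx, data):
    """Hypothetical stand-in: order data by bin id using a stable argsort."""
    order = np.argsort(idx, kind='stable')
    return data[order]

# made-up bin ids and values; bin 1 is deliberately left empty
idx = np.array([2, 0, 2, 0, 3])
z = np.array([5.0, 1.0, 3.0, 4.0, 9.0])

zs = sort_to_bins_argsort(idx, z)
counts = np.bincount(idx, minlength=4)
boundaries = np.concatenate([[0], counts.cumsum()])   # start offset of each bin

# per-bin minimum; for an empty bin reduceat just returns zs[start]
mins = np.minimum.reduceat(zs, boundaries[:-1])
mins[counts == 0] = np.nan                            # so empty bins are masked out
```

To try it with the answer's code, the import line could be replaced by an alias to this function, under the assumption that the two behave the same for this use.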