Question

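I am looking for a way to smooth a scattered dataset. The scattered data comes from sampling a very large array that represents a raster. I had to vectorize this array in order to downsample it. Using the matplotlib.pyplot.contour() function, I obtained a reasonable set of point-value pairs.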

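The problem is that this signal is noisy and I need to smooth it. Smoothing the original array is not an option; I need to smooth the scattered data. The best function I could find is the one below, which I rewrote from its Matlab counterpart. While this function does the job, it is very slow. I am looking either for an alternative function to smooth this data or for a way to make the function below faster.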

import numpy as np


def limgrad(self, triangulation, values, dfdx, imax=100):
    """
    See https://github.com/dengwirda/mesh2d/blob/master/hjac-util/limgrad.m
    for original source code.
    """
    # triangulation is a matplotlib.tri.Triangulation instance
    edge = triangulation.edges
    dx = np.subtract(
        triangulation.x[edge[:, 0]], triangulation.x[edge[:, 1]])
    dy = np.subtract(
        triangulation.y[edge[:, 0]], triangulation.y[edge[:, 1]])
    elen = np.sqrt(dx**2+dy**2)  # length of every edge
    aset = np.zeros(values.shape)  # sweep in which each node last changed
    ftol = np.min(values) * np.sqrt(np.finfo(float).eps)
    for i in range(1, imax + 1):
        # nodes changed during the previous sweep form the active set
        aidx = np.where(aset == i-1)[0]
        if len(aidx) == 0:
            break
        # argsort returns positions within aidx, so map them back to
        # node indices (visited in order of increasing value)
        active_idxs = aidx[np.argsort(values[aidx])]
        for active_idx in active_idxs:
            # full scan of the edge array for every node: the slow part
            adj_edges_idxs = np.where(
                np.any(edge == active_idx, axis=1))[0]
            for epos in adj_edges_idxs:
                nod1, nod2 = edge[epos]
                # elen is indexed by edge (epos), not by node
                if values[nod1] > values[nod2]:
                    fun1 = values[nod2] + elen[epos] * dfdx
                    if values[nod1] > fun1+ftol:
                        values[nod1] = fun1
                        aset[nod1] = i
                else:
                    fun2 = values[nod1] + elen[epos] * dfdx
                    if values[nod2] > fun2+ftol:
                        values[nod2] = fun2
                        aset[nod2] = i
    return values

Answer 1

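I found the answer to my own question and am posting it here for reference. The algorithm above is slow because calling np.where() to generate adj_edges_idxs for every node incurs a large overhead. Instead, I precompute the node neighbors, which eliminates that overhead. Throughput went from about 80 iterations per second to about 80,000.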
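To illustrate the idea in isolation, here is a minimal plain-Python sketch that contrasts rescanning the edge list for every node with a lookup table built once. The names `build_neighbor_table`, `adjacent_edges_by_scan` and the toy `edges` list are hypothetical and only serve the illustration; they are not part of matplotlib or of the original Matlab code.

```python
from collections import defaultdict


def build_neighbor_table(edges):
    """Map each node to a list of (neighbor, edge_index) pairs in one pass."""
    table = defaultdict(list)
    for edge_index, (a, b) in enumerate(edges):
        table[a].append((b, edge_index))
        table[b].append((a, edge_index))
    return table


def adjacent_edges_by_scan(edges, node):
    """Slow variant: rescan the whole edge list on every query."""
    return [k for k, (a, b) in enumerate(edges) if node in (a, b)]


# Hypothetical toy mesh: triangles (0, 1, 2) and (1, 2, 3) sharing edge (1, 2).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
table = build_neighbor_table(edges)

# Both approaches report the same edges touching node 1.
print(sorted(k for _, k in table[1]))    # [0, 2, 3]
print(adjacent_edges_by_scan(edges, 1))  # [0, 2, 3]
```

The scan costs time proportional to the number of edges for every node visited, while the table costs one pass up front and then only the node's own degree per lookup. Storing the edge index next to each neighbor also keeps the matching edge length easy to find.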

The final version is as follows:

from collections import defaultdict

import numpy as np


def limgrad(tri, values, dfdx=0.2, imax=100):
    """
    See https://github.com/dengwirda/mesh2d/blob/master/hjac-util/limgrad.m
    for original source code.
    """
    # tri is a matplotlib.tri.Triangulation instance
    xy = np.vstack([tri.x, tri.y]).T
    edge = tri.edges
    dx = np.subtract(xy[edge[:, 0], 0], xy[edge[:, 1], 0])
    dy = np.subtract(xy[edge[:, 0], 1], xy[edge[:, 1], 1])
    elen = np.sqrt(dx**2+dy**2)  # length of every edge
    ffun = values.flatten()  # work on a copy of the input
    aset = np.zeros(ffun.shape)  # sweep in which each node last changed
    ftol = np.min(ffun) * np.sqrt(np.finfo(float).eps)
    # precompute neighbor table: node -> [(neighbor, edge length), ...]
    # (elen is indexed by edge, so the length is stored with each neighbor)
    point_neighbors = defaultdict(list)
    for (i, j), length in zip(edge.tolist(), elen.tolist()):
        point_neighbors[i].append((j, length))
        point_neighbors[j].append((i, length))
    # iterative smoothing
    for _iter in range(1, imax+1):
        # nodes changed during the previous sweep form the active set
        aidx = np.where(aset == _iter-1)[0]
        if len(aidx) == 0:
            break
        # argsort returns positions within aidx; map back to node indices
        active_idxs = aidx[np.argsort(ffun[aidx])]
        for active_idx in active_idxs.tolist():
            for adj_node, length in point_neighbors[active_idx]:
                if ffun[adj_node] > ffun[active_idx]:
                    fun1 = ffun[active_idx] + length * dfdx
                    if ffun[adj_node] > fun1+ftol:
                        ffun[adj_node] = fun1
                        aset[adj_node] = _iter
                else:
                    fun2 = ffun[adj_node] + length * dfdx
                    if ffun[active_idx] > fun2+ftol:
                        ffun[active_idx] = fun2
                        aset[active_idx] = _iter
    # True if the loop stopped before hitting the iteration cap
    flag = _iter < imax
    return ffun, flag
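A gradient-limiting routine of this kind is meant to leave no edge steeper than dfdx once it has converged, so one simple way to sanity-check a result is to measure the worst remaining violation. The helper below is a plain-Python sketch under that assumption; `max_gradient_violation` and the toy data are hypothetical names and values chosen for illustration only.

```python
import math


def max_gradient_violation(points, edges, values, dfdx):
    """Return the largest excess of |f_i - f_j| over dfdx * edge length.

    A result at or below zero (up to rounding) means every edge
    respects the gradient limit.
    """
    worst = -math.inf
    for i, j in edges:
        length = math.hypot(points[i][0] - points[j][0],
                            points[i][1] - points[j][1])
        excess = abs(values[i] - values[j]) - dfdx * length
        worst = max(worst, excess)
    return worst


# Hypothetical toy mesh on the unit square, with one noisy spike at node 3.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
noisy = [1.0, 1.1, 1.2, 5.0]
limited = [1.0, 1.1, 1.2, 1.3]  # spike capped at neighbor + length * dfdx

violation_before = max_gradient_violation(points, edges, noisy, dfdx=0.2)
violation_after = max_gradient_violation(points, edges, limited, dfdx=0.2)
print(violation_before > 0, violation_after <= 1e-12)
```

Comparing the returned value against a small tolerance rather than exactly zero avoids false alarms from floating-point rounding.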

Fast smoothing of scattered data
