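I'm trying to downsample a D-dimensional array of points (D > 1). The steps are: (1) build a D-dimensional histogram of the points, (2) clip the histogram at a maximum of 1, and (3) extract the coordinates of the bins that contain an element.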
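The idea is to go from the point distribution shown on the left to the one shown on the right (visualized here in 2D):
The first two steps are simple, but I'm struggling to find a fast way (this is important) to do the third. That is: how can I retrieve, as fast as possible, the edge coordinates of the D-dimensional bins of the clipped histogram that contain 1 element?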
import numpy as np
N, D = 500, 3
# Random D-dimensional distribution of N points
arr = np.random.uniform(0., 10., (N, D))
# Some random number of bins
bins = np.random.randint(20, 40, D)
# D-dimensional histogram
hst, edges = np.histogramdd(arr, bins=bins)
# Clip at max=1
hst = np.clip(hst, a_min=None, a_max=1)
# This block below needs to be as fast as possible
# Find bins where there is 1 element
idxs = np.array(np.where(hst > 0)).T
# Extract coordinates
hst_dwnsmp = []
for pt in idxs:
    coord = []
    for i, j in enumerate(pt):
        coord.append(edges[i][j])
    hst_dwnsmp.append(coord)
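
For reference, here is a minimal sketch of one way the loop above might be vectorized. It assumes, as the loop does, that the lower (left) edge of each occupied bin is the coordinate wanted. `np.nonzero` and `np.column_stack` are swapped in for the explicit Python loop, and the clipping step is skipped because only the occupied/empty distinction matters here. I have not benchmarked it, so any speedup is an expectation rather than a measured result.

```python
import numpy as np

N, D = 500, 3
# Same setup as above: random points and a random number of bins per dimension
arr = np.random.uniform(0., 10., (N, D))
bins = np.random.randint(20, 40, D)
hst, edges = np.histogramdd(arr, bins=bins)

# Indices of the occupied bins: a tuple of D index arrays, one per dimension
idxs = np.nonzero(hst)

# Index each dimension's edge array with that dimension's bin indices,
# then stack the D results as columns -> shape (n_occupied_bins, D)
hst_dwnsmp = np.column_stack([edges[d][idxs[d]] for d in range(D)])
```

The result is a NumPy array rather than a list of lists, with rows in the same order as the loop produces. The only remaining Python-level iteration is over the D dimensions, not over the occupied bins.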