Two-dimensional np.digitize

Date: 2015-07-26 08:59:54

Tags: python numpy pandas scipy binning

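I have two-dimensional data and a set of two-dimensional bins generated with scipy.stats.binned_statistic_2d. For each data point, I want the index of the bin it falls into. This is exactly what np.digitize does, but as far as I can tell it only handles one-dimensional data. This StackExchange post seems to have an answer, but it is fully generalized to n dimensions. Is there a more straightforward solution for two dimensions?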

2 answers:

Answer 0 (score: 5)

You can already get the bin index of each observation from the fourth return value of scipy.stats.binned_statistic_2d:

Returns:  
  statistic : (nx, ny) ndarray
      The values of the selected statistic in each two-dimensional bin
  xedges : (nx + 1) ndarray
      The bin edges along the first dimension.
  yedges : (ny + 1) ndarray
      The bin edges along the second dimension.
  binnumber : 1-D ndarray of ints
      This assigns to each observation an integer that represents the bin
      in which this observation falls. Array has the same length as values.
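
By default binnumber is a single linearized index over an (nx + 2, ny + 2) grid, because each dimension gets one extra bin on either side for out-of-range points. A minimal sketch of turning that into per-dimension indices, either with np.unravel_index or by passing expand_binnumbers=True (available in more recent SciPy releases, so possibly not in the version this answer was written against). The sample points and edges are made up for illustration, and x is passed as values only as a placeholder, since statistic="count" does not use it:

```python
import numpy as np
from scipy.stats import binned_statistic_2d

# Illustrative data: four points, two bins per dimension.
x = np.array([0.1, 0.4, 0.6, 0.9])
y = np.array([0.2, 0.8, 0.6, 0.1])
xedges = [0.0, 0.5, 1.0]
yedges = [0.0, 0.5, 1.0]
nx, ny = len(xedges) - 1, len(yedges) - 1

# Default: one linearized bin number per observation.
_, _, _, linear = binned_statistic_2d(
    x, y, x, statistic="count", bins=[xedges, yedges]
)
# Unravel over the padded grid (one outlier bin on each side).
unraveled = np.asarray(np.unravel_index(linear, (nx + 2, ny + 2)))

# Alternative: ask SciPy for per-dimension indices directly.
_, _, _, expanded = binned_statistic_2d(
    x, y, x, statistic="count", bins=[xedges, yedges],
    expand_binnumbers=True,
)

print(expanded)  # row 0: x-bin of each point, row 1: y-bin of each point
```

The per-dimension indices start at 1 for the first real bin, the same convention np.digitize uses when a value lies inside the edges, so subtract 1 before using them to index the statistic array.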

Answer 1 (score: 0)

A simple solution using numpy:

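import numpy as np

bins = [[0.3, 0.5, 0.7], [0.3, 0.7]]  # bin edges for x and y
values = np.random.random((10, 2))
digitized = []
for i in range(len(bins)):
    # bin index of every point along dimension i
    digitized.append(np.digitize(values[:, i], bins[i], right=False))
# one row per point, one column per dimension; concatenating and then
# reshaping to (10, 2) would mix indices from different points
digitized = np.column_stack(digitized)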
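
The per-dimension loop can also be wrapped in a small helper that returns one row of bin indices per point. This is only a sketch: digitize_2d is a hypothetical name, not a NumPy function, and the points are fixed so the result is easy to check by hand.

```python
import numpy as np

def digitize_2d(points, edges):
    """Hypothetical helper: column i is np.digitize(points[:, i], edges[i])."""
    points = np.asarray(points)
    per_dim = [
        np.digitize(points[:, i], edges[i]) for i in range(points.shape[1])
    ]
    # Stack along axis 1 so each row holds the indices of a single point.
    return np.stack(per_dim, axis=1)

edges = [[0.3, 0.5, 0.7], [0.3, 0.7]]
points = np.array([[0.1, 0.1], [0.4, 0.5], [0.6, 0.9], [0.8, 0.2]])
idx = digitize_2d(points, edges)
print(idx)
```

As with np.digitize itself, index 0 means the value lies below the first edge and len(edges[i]) means it lies at or above the last edge.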