How to "bin" a numpy array using custom (non-linearly spaced) buckets?

Time: 2017-09-14 10:29:27

Tags: numpy scipy binning

How can I "bin" the array below in numpy so that:

import numpy as np
bins = np.array([-0.1 , -0.07, -0.02,  0.  ,  0.02,  0.07,  0.1 ])
array = np.array([-0.21950869, -0.02854823,  0.22329239, -0.28073936, -0.15926265,
              -0.43688216,  0.03600587, -0.05101109, -0.24318651, -0.06727875])

i.e. replace each `value` in `array` with the following:

-0.1 where `value` < -0.085
-0.07 where -0.085 <= `value` < -0.045
-0.02 where -0.045 <= `value` < -0.01
0.0 where -0.01 <= `value` < 0.01
0.02 where 0.01 <= `value` < 0.045
0.07 where 0.045 <= `value` < 0.085
0.1 where `value` >= 0.085

The expected output would be:

array = np.array([-0.1, -0.02,  0.1, -0.1, -0.1, -0.1,  0.02, -0.07, -0.1, -0.07])

I know that numpy has a digitize function, but it returns the index of the bin rather than the bin value itself. That is:

np.digitize(array, bins)
np.array([0, 2, 7, 0, 0, 0, 5, 2, 0, 2])

1 Answer:

Answer 0: (score: 1)

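Get the mid-values by averaging each pair of consecutive bin values. Then use np.searchsorted or np.digitize with those mid-values to get the indices. Finally, index into bins to produce the output.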

Mid-values:

mid_bins = (bins[1:] + bins[:-1])/2.0

Indices with searchsorted or digitize (the two are alternatives; use either one):

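# Option 1: a value exactly equal to a mid-value falls in the lower bin;
# pass side='right' to get the same tie-breaking as np.digitize
idx = np.searchsorted(mid_bins, array)
# Option 2: a value exactly equal to a mid-value falls in the upper bin
idx = np.digitize(array, mid_bins)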

Output:

out = bins[idx]
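
Putting the three steps together, here is a minimal end-to-end sketch using the data from the question. The helper name `snap_to_bins` is made up for illustration and is not a numpy function. It assumes `bins` is sorted in ascending order, and it uses np.digitize so that a value sitting exactly on a mid-value goes to the upper bin, matching the `<=` / `<` rules in the question.

```python
import numpy as np

def snap_to_bins(values, bins):
    """Replace each value with the bin value whose mid-point interval contains it.

    Hypothetical helper for illustration; assumes `bins` is sorted ascending.
    """
    values = np.asarray(values, dtype=float)
    bins = np.asarray(bins, dtype=float)
    # Boundaries between neighbouring bins: one fewer than the number of bins
    mid_bins = (bins[1:] + bins[:-1]) / 2.0
    # Indices run from 0 to len(bins) - 1, so they are always valid for `bins`
    idx = np.digitize(values, mid_bins)
    return bins[idx]

bins = np.array([-0.1, -0.07, -0.02, 0.0, 0.02, 0.07, 0.1])
array = np.array([-0.21950869, -0.02854823, 0.22329239, -0.28073936, -0.15926265,
                  -0.43688216, 0.03600587, -0.05101109, -0.24318651, -0.06727875])

out = snap_to_bins(array, bins)

# Compare against the expected output stated in the question
expected = np.array([-0.1, -0.02, 0.1, -0.1, -0.1, -0.1, 0.02, -0.07, -0.1, -0.07])
print(np.allclose(out, expected))
```

Because there are only len(bins) - 1 mid-values, the indices never exceed the last position in bins, which is what goes wrong when digitize is applied to bins directly (it returned 7 above, one past the last valid index).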