Retrieving bin data from a 2D histogram in numpy

Asked: 2016-03-12 18:55:23

Tags: python numpy

I managed to sort about 200 points into bins using numpy.histogram2d(). What I can't figure out is how to access the values stored in each bin.

Any idea how to do this?

3 answers:

Answer 0: (score: 0)

From the numpy docs:

import numpy as np
xedges = [0, 1, 1.5, 3, 5]
yedges = [0, 2, 3, 4, 6]
x = np.random.normal(3, 1, 100)
y = np.random.normal(1, 1, 100)
# Pass x first so that x is binned with xedges and y with yedges
H, xedges, yedges = np.histogram2d(x, y, bins=(xedges, yedges))

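H contains the 2D histogram values, i.e. the number of points in each bin. If xedges has length m and yedges has length n, H will have shape (m-1, n-1).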
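As a quick check of that shape rule, here is a minimal sketch. The fixed seed and the names rng, m, n, i and j are my own choices for illustration and do not come from the numpy docs.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, assumed only for reproducibility
xedges = [0, 1, 1.5, 3, 5]
yedges = [0, 2, 3, 4, 6]
x = rng.normal(3, 1, 100)
y = rng.normal(1, 1, 100)

H, xe, ye = np.histogram2d(x, y, bins=(xedges, yedges))

m, n = len(xedges), len(yedges)
print(H.shape == (m - 1, n - 1))  # True

# H[i, j] counts the points with xe[i] <= x < xe[i+1] and ye[j] <= y < ye[j+1]
# (the last bin along each axis also includes its right edge).
i, j = 2, 0
print(H[i, j], "points in x:", (xe[i], xe[i + 1]), "y:", (ye[j], ye[j + 1]))
```

Note that H only holds counts. The points themselves are not stored in it, so recovering them takes an extra step such as np.digitize.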

You can also specify the number of bins for each dimension:

x = np.random.normal(3, 1, 100)
y = np.random.normal(1, 1, 100)
H, xedges, yedges = np.histogram2d(x, y, bins=(5, 6))

The shape of H will match what you passed in the bins keyword: (5, 6)
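A small sketch to confirm this, and to show how the returned edge arrays relate to the bin counts. The seed is an assumption made only so the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, assumed only for reproducibility
x = rng.normal(3, 1, 100)
y = rng.normal(1, 1, 100)

H, xedges, yedges = np.histogram2d(x, y, bins=(5, 6))

print(H.shape)       # (5, 6)
print(len(xedges))   # 6: one more edge than bins along x
print(len(yedges))   # 7: one more edge than bins along y
print(int(H.sum()))  # 100: with no explicit range, every point lands in a bin
```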

Answer 1: (score: 0)

I just tried this example in the matplotlib manual

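Note the line hist, xedges, yedges = np.histogram2d(x, y, bins=4)

The function returns three values. hist is a 2D array holding the count for each bin; it is the same array you would pass to imshow to plot this histogram.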
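To make that concrete, here is a small sketch with made-up points. The data and the range are assumptions, chosen so the result is easy to check by hand.

```python
import numpy as np

# Five hypothetical points in the unit square
x = np.array([0.1, 0.2, 0.6, 0.7, 0.9])
y = np.array([0.1, 0.3, 0.8, 0.9, 0.2])

hist, xedges, yedges = np.histogram2d(x, y, bins=2, range=((0, 1), (0, 1)))

print(hist)
# [[2. 0.]
#  [1. 2.]]

# Read a single bin: hist[1, 0] covers 0.5 <= x <= 1 and 0 <= y < 0.5
print(hist[1, 0])  # 1.0
```

So hist[i, j] gives the count for a bin, but not the points in it.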

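Answer 2: (score: 0)

I'm currently facing the same challenge, and I haven't found a solution online or in the documentation.

So here is what I came up with: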

import numpy as np
import matplotlib.pyplot as plt

# Say you have the following coordinate points:
data = np.array([[-73.589,  45.490],
                 [-73.591,  45.497],
                 [-73.592,  45.502],
                 [-73.574,  45.531],
                 [-73.552,  45.534],
                 [-73.570,  45.512]])

# These variables set the range covered by the bins. I use values a bit
# wider than the min and max of my x and y data.
extenti = (-73.600, -73.540)
extentj = (45.480, 45.540)

# Run numpy's histogram2d function to return two variables we'll be using 
# later: hist and edges
hist, *edges = np.histogram2d(data[:,0], data[:,1], bins=4, range=(extenti, extentj))

# You can visualize the histogram using matplotlib's own 2D histogram
# (pass the same range so the plotted bins match the ones computed above):
plt.hist2d(data[:,0], data[:,1], bins=4, range=(extenti, extentj))

# We'll use numpy's digitize now. According to NumPy's documentation, numpy.digitize
# returns the indices of the bins to which each value in the input array belongs.
# However, I haven't yet managed to make it work directly on 2D histograms.
# You might manage to, but for now the following has been working well for me:

# Run np.digitize once along the x axis of our data, using edges[0].
# edges[0] contains the x-axis edges of the numpy.histogram2d we made
# earlier. This gives the x-axis bin index of each data point.
hitx = np.digitize(data[:, 0], edges[0])
# Now run it along the y axis, using edges[1]
hity = np.digitize(data[:, 1], edges[1])

# Now we put those together.
hitbins = list(zip(hitx, hity))

# And now we can associate our data points with the coordinates of the bin where
# each belongs
data_and_bins = list(zip(data, hitbins))

From there we can select a bin by its coordinates and find the data points associated with it!

You can do the following:

[item[0] for item in data_and_bins if item[1] == (1, 2)]

where (1, 2) is the coordinate pair of the bin you want to retrieve data from. In our case that bin holds two data points, and the line above lists them.
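If you need the points for many bins, scanning the whole list for each one gets slow. Here is a sketch of two alternatives that reuse the same data and np.digitize calls: a boolean mask for a single bin, and a dictionary keyed by bin coordinates. The names in_bin and points_by_bin are hypothetical, chosen only for this example.

```python
from collections import defaultdict

import numpy as np

data = np.array([[-73.589, 45.490],
                 [-73.591, 45.497],
                 [-73.592, 45.502],
                 [-73.574, 45.531],
                 [-73.552, 45.534],
                 [-73.570, 45.512]])
extenti = (-73.600, -73.540)
extentj = (45.480, 45.540)

hist, *edges = np.histogram2d(data[:, 0], data[:, 1], bins=4,
                              range=(extenti, extentj))
hitx = np.digitize(data[:, 0], edges[0])
hity = np.digitize(data[:, 1], edges[1])

# Option 1: boolean mask that picks out the rows falling in bin (1, 2)
in_bin = data[(hitx == 1) & (hity == 2)]
print(in_bin)

# Option 2: group every point by its (x-bin, y-bin) pair in one pass
points_by_bin = defaultdict(list)
for point, ix, iy in zip(data, hitx, hity):
    points_by_bin[(int(ix), int(iy))].append(point)
print(len(points_by_bin[(1, 2)]))  # 2
```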

Keep in mind that np.digitize() uses 0 and len(bins) to flag out-of-range values, which means the first bin has coordinates (1, 1) rather than (0, 0).
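If you would rather have indices that line up with the hist array, you can shift them down by one. This is a minimal sketch that assumes every point lies inside the histogram range. The names ix, iy and counts are hypothetical.

```python
import numpy as np

data = np.array([[-73.589, 45.490],
                 [-73.591, 45.497],
                 [-73.592, 45.502],
                 [-73.574, 45.531],
                 [-73.552, 45.534],
                 [-73.570, 45.512]])
extenti = (-73.600, -73.540)
extentj = (45.480, 45.540)

hist, xedges, yedges = np.histogram2d(data[:, 0], data[:, 1], bins=4,
                                      range=(extenti, extentj))

# Subtract 1 so the indices match hist. The clip folds a point lying exactly
# on the last edge into the last bin, which is what histogram2d does too.
# Assumption: no point lies outside the range; clip would otherwise pull
# such outliers into the border bins.
ix = np.clip(np.digitize(data[:, 0], xedges) - 1, 0, len(xedges) - 2)
iy = np.clip(np.digitize(data[:, 1], yedges) - 1, 0, len(yedges) - 2)

# Rebuild the counts from the indices and compare them with hist
counts = np.zeros_like(hist)
np.add.at(counts, (ix, iy), 1)
print(np.array_equal(counts, hist))  # True

# Bin (1, 2) in digitize's numbering is hist[0, 1] here
print(data[(ix == 0) & (iy == 1)])
```

With these indices, hist[ix[k], iy[k]] is the count of the bin that point k falls in.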

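Also make sure that you and numpy agree on which bin is the "first" one. I believe it counts from the bottom left toward the top right, but I may be mistaken there.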
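A one-point check can settle this. The sketch below uses a single made-up point with a small x and a large y, then looks at which cell of hist is non-zero.

```python
import numpy as np

# One hypothetical point with a small x and a large y
x = np.array([0.5])
y = np.array([3.5])

hist, xedges, yedges = np.histogram2d(x, y, bins=2, range=((0, 4), (0, 4)))

print(hist)
# [[0. 1.]
#  [0. 0.]]
```

The first index follows x and the second follows y, both counted upward from the lowest edge, so hist[0, 0] is the bin with the lowest x and the lowest y. In np.digitize numbering this point sits in bin (1, 2). Note that when you display hist as an image, rows are x and columns are y, so the picture only matches the usual plot orientation after transposing and putting the origin at the bottom.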

Hope this helps you or anyone else facing this challenge.