Question

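I have a dataset whose x values are stored in array a and whose corresponding y values are stored in array b. I am drawing a scatter plot of this dataset and using scipy's gaussian_kde to color the points by their density. The code looks like: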

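    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import gaussian_kde

    # Estimate the density of the point cloud at each point
    xy = np.vstack([x, y])
    z = gaussian_kde(xy)(xy)

    # Sort the points by density, so that the densest points are plotted last
    idx = z.argsort()
    x, y, z = x[idx], y[idx], z[idx]

    fig, ax = plt.subplots()
    # edgecolor='none' draws the markers without an outline
    ax.scatter(x, y, c=z, s=50, edgecolor='none')
    plt.show()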

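Now I have a second dataset (stored in arrays c and d, which hold the x and y values of dataset 2, respectively). I want to make a scatter plot of the FIRST dataset, but this time color-code the points by the ratio of the spatial density of points in the FIRST dataset to the spatial density of points in the second dataset, so that I can see where objects in the first dataset are relatively more prevalent. Does anyone have suggestions on how to approach this?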

Answer 1

Compute the spatial density of the second dataset. Then compute the ratio of the two densities. Sort by the ratio and plot.

    xy = np.vstack([x, y])
    z = gaussian_kde(xy)(xy)

    # spatial density of points in second data set
    cd = np.vstack([c, d])
    e = gaussian_kde(cd)(cd)

    # ratio of density in the two data sets
    # (only valid if both sets have the same number of points at matching coordinates)
    r = z / e

    # Sort the points by density ratio, so that the biggest ratios are plotted last
    idx = r.argsort()
    x, y, r = x[idx], y[idx], r[idx]

    fig, ax = plt.subplots()
    ax.scatter(x, y, c=r, s=50, edgecolor='none')
    plt.show()

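Edit: Sorry, I missed the comment saying that the coordinates in the two datasets don't correspond. Try using scipy.interpolate.interp2d() to create a function that approximates the spatial density of the second dataset. Then use that function to estimate the density at the coordinates of the first dataset.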

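    from scipy import interpolate

    # calculate spatial densities z (first set) and e (second set) as above

    # NOTE: interp2d was deprecated in SciPy 1.10 and removed in 1.14,
    # so this requires an older SciPy release
    f = interpolate.interp2d(c, d, e, kind='cubic')

    # f(x, y) with arrays evaluates on a grid, so call it one point at a time
    # to get the second set's density at each (x[i], y[i])
    r = z / np.array([f(xi, yi)[0] for xi, yi in zip(x, y)])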
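
One way to avoid the interpolation step, offered as a sketch and not as the approach of the original answer: a gaussian_kde object can be evaluated at arbitrary coordinates, so the KDE fitted to the second dataset can be called directly on the first dataset's points. The helper name density_ratio and the synthetic data below are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde


def density_ratio(x, y, c, d):
    """Ratio of the two datasets' KDE densities, evaluated at the points (x, y)."""
    xy = np.vstack([x, y])  # first dataset, shape (2, n)
    cd = np.vstack([c, d])  # second dataset, shape (2, m)
    z = gaussian_kde(xy)(xy)        # density of set 1 at set 1's points
    e_at_xy = gaussian_kde(cd)(xy)  # density of set 2 at the same points
    return z / e_at_xy


# Synthetic stand-in data (assumption, for illustration only)
rng = np.random.default_rng(0)
x, y = rng.normal(0.0, 1.0, size=(2, 300))
c, d = rng.normal(0.5, 1.5, size=(2, 500))
r = density_ratio(x, y, c, d)
```

Because both densities are evaluated at the same coordinates, r has one value per point of the first dataset regardless of how many points the second dataset contains. Where the second dataset is very sparse, the denominator gets close to zero and the ratio can become very large.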

Now sort by the ratio and plot as before.
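
The sort-and-plot step could be sketched as follows. The function name plot_by_ratio and the placeholder data are hypothetical, and coloring by log10 of the ratio with a diverging colormap is a choice made for this sketch (so that a ratio of 1 lands at the middle of the color scale), not something the original answer prescribes.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_by_ratio(x, y, r):
    """Scatter (x, y) colored by log10(r), with the largest ratios drawn last."""
    idx = np.argsort(r)
    x, y, log_r = x[idx], y[idx], np.log10(r[idx])
    # Symmetric color limits so that log10(r) = 0 (ratio 1) is the midpoint
    lim = np.max(np.abs(log_r))
    fig, ax = plt.subplots()
    sc = ax.scatter(x, y, c=log_r, s=50, edgecolor='none',
                    cmap='coolwarm', vmin=-lim, vmax=lim)
    fig.colorbar(sc, ax=ax, label='log10(density ratio)')
    return fig, ax


# Placeholder inputs (assumption): any positive ratio array of matching length works
rng = np.random.default_rng(1)
x, y = rng.normal(size=(2, 200))
r = np.exp(rng.normal(size=200))
fig, ax = plot_by_ratio(x, y, r)
```

Call plt.show() afterwards to display the figure. Points where the first dataset is relatively denser then appear at one end of the colormap, and points where the second dataset dominates appear at the other.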

Matplotlib: Coloring a scatter plot based on relative density
