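What I want to produce is similar to this plot:
It is a contour plot showing the regions that contain 68%, 95%, and 99.7% of the particles in two data sets.
So far I have tried a Gaussian KDE estimate and plotted its contours for these particles.
The data files are available here: https://www.dropbox.com/sh/86r9hf61wlzitvy/AABG2mbmmeokIiqXsZ8P76Swa?dl=0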
from scipy.stats import gaussian_kde
import matplotlib.pyplot as plt
import numpy as np
# My data
x = RelDist
y = RadVel
# Perform the kernel density estimate
k = gaussian_kde(np.vstack([RelDist, RadVel]))
xi, yi = np.mgrid[x.min():x.max():x.size**0.5*1j,y.min():y.max():y.size**0.5*1j]
zi = k(np.vstack([xi.flatten(), yi.flatten()]))
fig = plt.figure()
ax = fig.gca()
CS = ax.contour(xi, yi, zi.reshape(xi.shape), colors='darkslateblue')
plt.clabel(CS, inline=1, fontsize=10)
ax.set_xlim(20, 800)
ax.set_ylim(-450, 450)
ax.set_xscale('log')
plt.show()
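This produces:
[image: resulting contour plot]
where 1) I don't know how to control the number of bins in the Gaussian KDE, 2) the contour labels are all zero, and 3) I don't know how to determine the percentiles.
Any help is appreciated.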
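Answer 0 (score: 2)
Taken from this example in the matplotlib documentation.
You can rescale the data zi to a 0-1 range and then draw the contour plot.
You can also set the contour levels manually when calling plt.contour().
Here is an example built from two bivariate normal densities: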
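import numpy as np
import matplotlib.pyplot as plt

def bivariate_normal(X, Y, sigmax, sigmay, mux, muy):
    # Uncorrelated bivariate normal pdf; stands in for mlab.bivariate_normal,
    # which was removed in Matplotlib 3.1
    z = ((X - mux) / sigmax)**2 + ((Y - muy) / sigmay)**2
    return np.exp(-0.5 * z) / (2 * np.pi * sigmax * sigmay)

delta = 0.025
x = y = np.arange(-3.0, 3.01, delta)
X, Y = np.meshgrid(x, y)
Z1 = bivariate_normal(X, Y, 1.0, 1.0, 0.0, 0.0)
Z2 = bivariate_normal(X, Y, 1.5, 0.5, 1, 1)
Z = 10 * (Z1 - Z2)
# rescale Z to the 0-1 range (min-max normalization)
Z = (Z - Z.min()) / (Z.max() - Z.min())
levels = [0.68, 0.95, 0.997]
origin = 'lower'
CS = plt.contour(X, Y, Z, levels,
                 colors=('k',),
                 linewidths=(3,),
                 origin=origin)
plt.clabel(CS, fmt='%2.3f', colors='b', fontsize=14)
plt.show()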
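Note that levels taken from a min-max rescaled density are fractions of the peak density, not fractions of the enclosed particles. If the goal is contours that enclose 68%, 95%, and 99.7% of the probability mass, one possible approach is to sort the gridded density values and accumulate them. The sketch below is an illustration only: `mass_levels` is a hypothetical helper name, the density is a synthetic standard bivariate normal rather than the KDE of the real data, and it assumes a uniformly spaced grid so that every cell has the same area.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt


def mass_levels(density, fractions=(0.68, 0.95, 0.997)):
    """Density thresholds whose super-level sets hold `fractions` of the mass.

    Hypothetical helper. Assumes `density` is sampled on a uniformly spaced
    grid, so the constant cell area cancels out of the ratio.
    """
    flat = np.sort(density.ravel())[::-1]      # highest density first
    cumulative = np.cumsum(flat) / flat.sum()  # mass fraction accumulated so far
    # first cell at which the accumulated mass reaches each requested fraction
    idx = np.minimum(np.searchsorted(cumulative, fractions), flat.size - 1)
    return flat[idx]


# Synthetic stand-in for the KDE output: a standard bivariate normal on a grid
x = y = np.linspace(-5.0, 5.0, 401)
X, Y = np.meshgrid(x, y)
Z = np.exp(-0.5 * (X**2 + Y**2)) / (2.0 * np.pi)

fractions = (0.68, 0.95, 0.997)
levels = mass_levels(Z, fractions)

# contour() needs increasing levels, so sort them and keep the labels paired
order = np.argsort(levels)
sorted_levels = levels[order]
labels = {lev: f"{100 * fractions[i]:g}%" for lev, i in zip(sorted_levels, order)}

fig, ax = plt.subplots()
CS = ax.contour(X, Y, Z, levels=sorted_levels, colors="k")
ax.clabel(CS, fmt=labels, fontsize=8)  # fmt may be a dict mapping level -> text
fig.savefig("mass_contours.png")
```

For the real data, `Z` would be replaced by the reshaped KDE output `zi`. Passing a dict as `fmt` also avoids labels that print as zero when the raw density values are very small.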
With the data you provided, the same approach also works:
from scipy.stats import gaussian_kde
import matplotlib.pyplot as plt
import numpy as np
RadVel = np.loadtxt('RadVel.txt')
RelDist = np.loadtxt('RelDist.txt')
x = RelDist
y = RadVel
k = gaussian_kde(np.vstack([RelDist, RadVel]))
xi, yi = np.mgrid[x.min():x.max():x.size**0.5*1j,y.min():y.max():y.size**0.5*1j]
zi = k(np.vstack([xi.flatten(), yi.flatten()]))
# rescale zi to the 0-1 range (min-max normalization)
zi = (zi - zi.min()) / (zi.max() - zi.min())
zi = zi.reshape(xi.shape)
#set up plot
origin = 'lower'
levels = [0,0.1,0.25,0.5,0.68, 0.95, 0.975,1]
CS = plt.contour(xi, yi, zi,levels = levels,
colors=('k',),
linewidths=(1,),
origin=origin)
plt.clabel(CS, fmt='%.3f', colors='b', fontsize=8)
plt.xlim(10, 1000)
plt.xscale('log')
plt.ylim(-200, 200)
plt.show()
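On the question of controlling the "bin number": `gaussian_kde` does not bin the data. Its smoothness is set by the bandwidth, which can be changed through the `bw_method` argument, while the resolution of the evaluated surface is set separately by the number of grid points in `np.mgrid`. The following is a minimal sketch on synthetic data (the 500 random points stand in for RelDist and RadVel, and the values 0.5 and 100 are arbitrary choices for illustration):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Synthetic stand-in for (RelDist, RadVel): 500 correlated points, shape (2, 500)
data = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 2.0]], size=500).T

k_default = gaussian_kde(data)                # Scott's rule picks the bandwidth factor
k_smooth = gaussian_kde(data, bw_method=0.5)  # a scalar is used directly as the factor

# Grid resolution is independent of the KDE: the complex step n*1j
# asks mgrid for n points per axis
n = 100
xi, yi = np.mgrid[data[0].min():data[0].max():n * 1j,
                  data[1].min():data[1].max():n * 1j]
zi = k_smooth(np.vstack([xi.ravel(), yi.ravel()])).reshape(xi.shape)

print(k_default.factor, k_smooth.factor)
```

A larger factor gives smoother contours and a smaller one follows the data more closely. Increasing `n` only refines the grid on which the same estimate is drawn.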