I have two-dimensional data and want to estimate its joint distribution using kernel density estimation in Python. The only problem I am facing is how to impose a lower bound on the kernel density estimate (I have tried scipy.stats and sklearn.neighbors). For visualization, seaborn solves the problem with xlim and ylim. However, I later need to resample from the estimated distribution, so the truncation is critical: the sampled values on the y axis must be positive. Any hints? I have not found an option for this in Python so far. Thank you.
My code is below, along with a description of my outputs: the first corresponds to seaborn and the second to gaussian_kde in SciPy.
import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt
x = sub_data['x']
y = sub_data['y']
xmin, xmax = 90, 450
ymin, ymax = 0, 2
# [x, y] is the data
# Perform the kernel density estimate using scipy
xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([xx.ravel(), yy.ravel()])
values = np.vstack([x, y])
kernel = st.gaussian_kde(values)
# Draw 10 new samples, shape (2, 10); nothing here keeps y positive
new_samples_spy = kernel.resample(10)
f = np.reshape(kernel(positions).T, xx.shape)
fig = plt.figure()
ax = fig.gca()
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)
# Contourf plot
cfset = ax.contourf(xx, yy, f, cmap='Blues')
# Contour plot
cset = ax.contour(xx, yy, f, colors='k')
# Label plot
ax.clabel(cset, inline=1, fontsize=10)
ax.set_xlabel('x',fontsize=14)
ax.set_ylabel('y', fontsize=14)
plt.legend(loc='upper right')
plt.show()
# 2nd way using sklearn
from sklearn.neighbors import KernelDensity
def kde2D(x, y, bandwidth, xbins=100j, ybins=100j, **kwargs):
    """Build 2D kernel density estimate (KDE)."""
    # create grid of sample locations (default: 100x100)
    xx, yy = np.mgrid[x.min():x.max():xbins, y.min():y.max():ybins]
    xy_sample = np.vstack([yy.ravel(), xx.ravel()]).T
    xy_train = np.vstack([y, x]).T
    kde_skl = KernelDensity(bandwidth=bandwidth, **kwargs)
    kde_skl.fit(xy_train)
    # score_samples() returns the log-likelihood of the samples
    z = np.exp(kde_skl.score_samples(xy_sample))
    return xx, yy, np.reshape(z, xx.shape)
fig = plt.figure(2)
ax = fig.gca()
ax.set_xlim(xmin, xmax)
ax.set_ylim(ymin, ymax)
xx, yy, zz = kde2D(x, y, 1.0)
# Contourf plot (zz is the sklearn estimate returned by kde2D)
cfset = ax.contourf(xx, yy, zz, cmap='Blues')
# Contour plot
cset = ax.contour(xx, yy, zz, colors='k')
# Label plot
ax.clabel(cset, inline=1, fontsize=10)
ax.set_xlabel('x',fontsize=14)
ax.set_ylabel('y', fontsize=14)
plt.legend(loc='upper right')
plt.show()
# third way
import seaborn as sns
# jointplot creates its own figure, so no plt.figure() call is needed
sn = sns.jointplot(x="x", y="y", data=sub_data, kind="kde", xlim=(0, 500), ylim=(0, 1.2))
# Label the joint axes of the JointGrid that jointplot returns
sn.set_axis_labels('x', 'y', fontsize=14)
plt.show()
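For the resampling step, one workaround I can imagine is rejection sampling: draw from the fitted gaussian_kde and discard every point whose y value falls below the bound. This is only a minimal sketch and not a true bounded KDE. It samples from the unbounded estimate truncated and renormalized to y > y_min, and it does not correct the bias of the density near the boundary. The helper name resample_truncated, its parameters y_min and batch, and the synthetic data standing in for sub_data are all my own assumptions for illustration.

```python
import numpy as np
from scipy import stats as st


def resample_truncated(kernel, n, y_min=0.0, batch=1000, seed=None):
    """Draw n points from a 2D gaussian_kde, keeping only those with y > y_min.

    Hypothetical helper: plain rejection sampling on top of kernel.resample().
    """
    rng = np.random.default_rng(seed)
    kept = []
    count = 0
    while count < n:
        draw = kernel.resample(batch, seed=rng)  # array of shape (2, batch)
        ok = draw[:, draw[1] > y_min]            # drop points below the bound
        kept.append(ok)
        count += ok.shape[1]
    return np.hstack(kept)[:, :n]


# Synthetic stand-in for sub_data (assumed, not the real data)
data_rng = np.random.default_rng(0)
x = data_rng.uniform(90, 450, size=500)
y = data_rng.gamma(shape=2.0, scale=0.2, size=500)

kernel = st.gaussian_kde(np.vstack([x, y]))
samples = resample_truncated(kernel, 10, seed=1)  # shape (2, 10), all y > 0
```

An alternative along the same lines would be to fit the KDE on (x, log(y)) and exponentiate the resampled y values, which guarantees positivity by construction but changes the shape of the estimate near zero.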