scikit-learn: How do I use a fitted probability model?

Time: 2015-09-06 02:10:06

Tags: python scikit-learn

So I have used scikit-learn's Gaussian mixture models (http://scikit-learn.org/stable/modules/mixture.html) to fit my data, and now I want to use the model. How can I do that? Specifically:

  1. How can I plot the probability density distribution?
  2. How can I calculate the mean squared error of the fitted model?

Here is the code you may need:

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.colors import LogNorm
    from sklearn import mixture
    import matplotlib as mpl
    from matplotlib.patches import Ellipse
    %matplotlib inline

    n_samples = 300

    # generate random sample, two components
    np.random.seed(0)
    shifted_gaussian = np.random.randn(n_samples, 2) + np.array([20, 5])
    sample = shifted_gaussian

    # fit a Gaussian Mixture Model with two components
    clf = mixture.GMM(n_components=2, covariance_type='full')
    clf.fit(sample)

    # plot sample scatter
    plt.scatter(sample[:, 0], sample[:, 1])

    # 1. Plot the probability density distribution
    # 2. Calculate the mean square error of the fitting model

Update: I can plot the distribution with:

    x = np.linspace(-20.0, 30.0)
    y = np.linspace(-20.0, 40.0)
    X, Y = np.meshgrid(x, y)
    XX = np.array([X.ravel(), Y.ravel()]).T
    # GMM.score_samples returns (log-probabilities, responsibilities);
    # negating the first element gives the negative log-likelihood
    Z = -clf.score_samples(XX)[0]
    Z = Z.reshape(X.shape)

    CS = plt.contour(X, Y, Z, norm=LogNorm(vmin=1.0, vmax=1000.0),
                     levels=np.logspace(0, 3, 10))
    CB = plt.colorbar(CS, shrink=0.8, extend='both')

But isn't this rather weird? Is there a better way to do it? Can I plot something like this? [image]

1 answer:

Answer 0 (score: 2)

I think the result is reasonable if you adjust xlim and ylim a bit:

# plot sample scatter
plt.scatter(sample[:, 0], sample[:, 1], marker='+', alpha=0.5)

# 1. Plot the probability density distribution
# 2. Calculate the mean square error of the fitting model
x = np.linspace(-20.0, 30.0, 100)
y = np.linspace(-20.0, 40.0, 100)
X, Y = np.meshgrid(x, y)
XX = np.array([X.ravel(), Y.ravel()]).T
# GMM.score_samples returns (log-probabilities, responsibilities);
# negating the first element gives the negative log-likelihood
Z = -clf.score_samples(XX)[0]
Z = Z.reshape(X.shape)

CS = plt.contour(X, Y, Z, norm=LogNorm(vmin=1.0, vmax=10.0),
                 levels=np.logspace(0, 1, 10))
CB = plt.colorbar(CS, shrink=0.8, extend='both')
plt.xlim((10,30))
plt.ylim((-5, 15))

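For anyone on a newer scikit-learn: `mixture.GMM` was later replaced by `mixture.GaussianMixture`, whose `score_samples` returns only the per-sample log-likelihood, so the `[0]` index is no longer needed. The sketch below repeats the same idea under that API and plots the density itself (the exponential of the log-likelihood) rather than its negative log. The helper names `make_sample`, `density_on_grid` and `plot_density` are made up for this illustration, and `random_state=0` is my own choice. On the second question, a density model has no target value to take a squared error against, so this assumes a likelihood-based measure is an acceptable substitute and reports `score` (mean log-likelihood per sample) and `bic`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def make_sample(n_samples=300, seed=0):
    """Same recipe as the question: one 2-D Gaussian blob shifted to (20, 5)."""
    rng = np.random.RandomState(seed)
    return rng.randn(n_samples, 2) + np.array([20, 5])


def density_on_grid(model, xlim, ylim, n=100):
    """Evaluate the fitted density p(x, y) on an n-by-n grid."""
    x = np.linspace(xlim[0], xlim[1], n)
    y = np.linspace(ylim[0], ylim[1], n)
    X, Y = np.meshgrid(x, y)
    XX = np.column_stack([X.ravel(), Y.ravel()])
    # score_samples returns log p(x) for each row as a plain array
    Z = np.exp(model.score_samples(XX)).reshape(X.shape)
    return X, Y, Z


def plot_density(sample, X, Y, Z):
    """Filled contours of the density with the data points on top."""
    import matplotlib.pyplot as plt

    cs = plt.contourf(X, Y, Z, levels=10, alpha=0.7)
    plt.colorbar(cs, label="density")
    plt.scatter(sample[:, 0], sample[:, 1], marker="+", alpha=0.5)
    plt.show()


sample = make_sample()
clf = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(sample)

# Grid limits follow the xlim/ylim used in the answer above
X, Y, Z = density_on_grid(clf, xlim=(10, 30), ylim=(-5, 15))

# Likelihood-based fit measures (not a mean squared error)
print("mean log-likelihood per sample:", clf.score(sample))
print("BIC:", clf.bic(sample))
```

Calling `plot_density(sample, X, Y, Z)` afterwards draws the figure. Because `Z` is an actual density here, summing it over the grid and multiplying by the cell area should come out close to 1, which is a quick sanity check that the grid covers the bulk of the distribution.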