Visualizing the fitted Gaussian distributions from a GMM model

Time: 2016-11-30 18:34:05

Tags: python matplotlib scikit-learn mixture-model

I am trying to visualize the fitted Gaussian distributions from a Gaussian mixture model and can't seem to figure it out. Here and here I have seen examples of visualizing the fitted distribution for a one-dimensional model, but I don't know how to apply that to a model with 3 features. Is it possible to visualize the fitted distribution for each training feature?

I have named my model estimator and trained it with X_train:
import numpy as np
from sklearn.mixture import GaussianMixture

estimator = GaussianMixture(covariance_type='full', init_params='kmeans', max_iter=100,
        means_init=np.array([[ 0.41297,  3.39635,  2.68793],
                             [ 0.33418,  3.82157,  4.47384],
                             [ 0.29792,  3.98821,  5.78627]]),
        n_components=3, n_init=1, precisions_init=None, random_state=0,
        reg_covar=1e-06, tol=0.001, verbose=0, verbose_interval=10,
        warm_start=False, weights_init=None)

The first 6 samples of X_train look like this:

X_train[:6,:] = array([[  0.29818663,   3.72573161,   4.19829702],
       [  0.24693619,   4.33026266,  10.74416161],
       [  0.21932575,   3.98019433,   8.02464581],
       [  0.24426255,   4.41868353,  10.52576923],
       [  0.16577695,   4.35316706,  12.63638592],
       [  0.28952628,   4.03706551,   8.03804016]])

X_train has shape (3753L, 3L). My plotting routine for the fitted Gaussian distribution of the first feature is as follows:

fig, (ax1,ax2,a3) = plt.subplots(nrows=3)
#Domain for pdf
x = np.linspace(0,0.8,3753)
logprob = estimator.score_samples(X_train)
resp = estimator.predict_proba(X_train)
pdf = np.exp(logprob)
pdf_individual = resp * pdf[:, np.newaxis]
ax1.hist(X_train[:,0],30, normed=True, histtype='stepfilled', alpha=0.4)    
ax1.plot(x, pdf, '-k')
ax1.plot(x, pdf_individual, '--k')
ax1.text(0.04, 0.96, "Best-fit Mixture",
        ha='left', va='top', transform=ax.transAxes)
ax1.set_xlabel('$x$')
ax1.set_ylabel('$p(x)$')  
plt.show()    

But this does not seem to work. Any ideas on how to make it work?

1 Answer:

Answer 0 (score: 0):

If I load your sample data and fit the estimator:

X_train = np.array([[  0.29818663,   3.72573161,   4.19829702],
                    [  0.24693619,   4.33026266,  10.74416161],
                    [  0.21932575,   3.98019433,   8.02464581],
                    [  0.24426255,   4.41868353,  10.52576923],
                    [  0.16577695,   4.35316706,  12.63638592],
                    [  0.28952628,   4.03706551,   8.03804016]])
estimator.fit(X_train)

There are a couple of problems: the linspace length is not correct, and you are calling ax.transAxes but you have not defined any ax. Here is a working version:

fig, (ax1,ax2,a3) = plt.subplots(nrows=3)

logprob = estimator.score_samples(X_train)
resp = estimator.predict_proba(X_train)

Here the length should match that of logprob / pdf:
#Domain for pdf
x = np.linspace(0,0.8,len(logprob))

pdf = np.exp(logprob)
pdf_individual = resp * pdf[:, np.newaxis]
ax1.hist(X_train[:,0],30, normed=True, histtype='stepfilled', alpha=0.4)    
ax1.plot(x, pdf, '-k')
ax1.plot(x, pdf_individual, '--k')

Here, ax1.transAxes is what is intended:

ax1.text(0.04, 0.96, "Best-fit Mixture",
        ha='left', va='top', transform=ax1.transAxes)
ax1.set_xlabel('$x$')
ax1.set_ylabel('$p(x)$')  
plt.show()

Result plot
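
As a follow-up to the original question (visualizing the fitted distribution of each of the 3 features): instead of evaluating score_samples on the training data, the one-dimensional marginals of the mixture can be computed directly from the fitted parameters (weights_, means_, covariances_). The snippet below is only a sketch, not part of the original answer; it assumes the estimator above (covariance_type='full'), the full X_train array, and a recent matplotlib/scipy (density= instead of the older normed= keyword).

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Marginal of a full-covariance Gaussian mixture along feature d:
# a weighted sum of 1-D normals N(means_[k, d], covariances_[k, d, d]).
fig, axes = plt.subplots(nrows=3, figsize=(6, 9))
for d, ax in enumerate(axes):
    # Evaluation grid spanning the observed range of feature d
    x = np.linspace(X_train[:, d].min(), X_train[:, d].max(), 500)
    pdf_components = np.array([
        w * norm.pdf(x, loc=mu[d], scale=np.sqrt(cov[d, d]))
        for w, mu, cov in zip(estimator.weights_,
                              estimator.means_,
                              estimator.covariances_)
    ])
    # density=True is the modern spelling of normed=True
    ax.hist(X_train[:, d], bins=30, density=True,
            histtype='stepfilled', alpha=0.4)
    ax.plot(x, pdf_components.sum(axis=0), '-k')   # full mixture marginal
    ax.plot(x, pdf_components.T, '--k')            # individual components
    ax.set_xlabel('feature %d' % d)
    ax.set_ylabel('$p(x)$')
plt.tight_layout()
plt.show()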