Restricted Boltzmann Machine: how to predict class labels?

Time: 2014-05-02 00:59:53

Tags: scikit-learn

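I was reading through the Restricted Boltzmann Machine example on the scikit-learn site, and after getting that example to work I wanted to play around more with BernoulliRBM to get a better feel for how RBMs work. I tried to do some simple class prediction: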

# Adapted from sample digits recognition client on Scikit-Learn site.

import numpy as np
from sklearn import linear_model, datasets
from sklearn.cross_validation import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.lda import LDA

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features.
Y = iris.target
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=10)

# Models we will use
rbm = BernoulliRBM(random_state=0, verbose=True)
logistic = linear_model.LogisticRegression()
classifier = Pipeline(steps=[('rbm', rbm), ('logistic', logistic)])
lda = LDA(n_components=3)

#########################################################################

# Training RBM-Logistic Pipeline
logistic.fit(X_train, Y_train)
classifier.fit(X_train, Y_train)

#########################################################################

# Get predictions
print "The RBM model:"
print "Predict: ", classifier.predict(X_test)
print "Real:    ", Y_test

print

print "Linear Discriminant Analysis: "
lda.fit(X_train, Y_train)
print "Predict: ", lda.predict(X_test)
print "Real:    ", Y_test    

Here is the output:

Iteration 0, pseudo-likelihood = 0.00, time = 0.02s
Iteration 1, pseudo-likelihood = 0.00, time = 0.02s
Iteration 2, pseudo-likelihood = 0.00, time = 0.02s
Iteration 3, pseudo-likelihood = 0.00, time = 0.02s
Iteration 4, pseudo-likelihood = 0.00, time = 0.02s
Iteration 5, pseudo-likelihood = 0.00, time = 0.02s
Iteration 6, pseudo-likelihood = 0.00, time = 0.02s
Iteration 7, pseudo-likelihood = 0.00, time = 0.01s
Iteration 8, pseudo-likelihood = 0.00, time = 0.01s
Iteration 9, pseudo-likelihood = 0.00, time = 0.02s
The RBM model:
Predict:  [2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]
Real:     [1 2 0 1 0 1 1 1 0 1 1 2 1 0 0 2 1 0 0 0 2 2 2 0 1 0 1 1 1 2]

Linear Discriminant Analysis:
Predict:  [2 2 0 1 0 1 2 1 0 1 1 1 1 0 0 2 1 0 0 0 2 2 2 0 1 0 1 1 2 2]
Real:     [1 2 0 1 0 1 1 1 0 1 1 2 1 0 0 2 1 0 0 0 2 2 2 0 1 0 1 1 1 2]

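Why does the RBM pipeline predict "2" for every sample in the test data, even though that is clearly incorrect (as the LDA results show)? How can I get Pipeline(rbm, logistic) to predict class labels properly? If you could explain this to a neural network newbie, I'd really appreciate it.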
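One likely contributor, offered as a sketch rather than a confirmed diagnosis: BernoulliRBM expects inputs that are binary or lie in [0, 1], while the raw iris measurements used here are in centimetres and all exceed 1. The digits example on the scikit-learn site rescales its pixel values to [0, 1] before fitting the RBM. The version below puts a MinMaxScaler step in front of the RBM. The step name "scale" and the max_iter=1000 setting are my own choices, and the imports assume a recent scikit-learn release (train_test_split in sklearn.model_selection) rather than the older one used in the question.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Same data and split as in the question: first two iris features.
iris = datasets.load_iris()
X = iris.data[:, :2]
Y = iris.target
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=10
)

# Assumption: rescale every feature to [0, 1] (range learned from the
# training split) so the RBM sees inputs in the range it is designed for.
classifier = Pipeline(steps=[
    ("scale", MinMaxScaler()),
    ("rbm", BernoulliRBM(random_state=0)),
    ("logistic", LogisticRegression(max_iter=1000)),
])
classifier.fit(X_train, Y_train)

predictions = classifier.predict(X_test)
print("Predict:", predictions)
print("Real:   ", Y_test)
```

Scaling only puts the data into the range the RBM is built for. With just two input features, accuracy will still depend on n_components, learning_rate, and n_iter, so it is worth comparing the result against the LDA baseline before drawing conclusions.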

0 answers:

No answers yet.