Question

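I have implemented and trained a neural network in Theano with k binary inputs (0, 1), one hidden layer, and one unit in the output layer. Once it has been trained, I want to obtain the input that maximizes the output (i.e. the x for which the output unit is closest to 1). So far I have not found an implementation of this, so I am trying the following approach: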

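1. Train the network => obtain the trained weights (theta1, theta2).
2. Define the neural network function with x as input and the trained theta1, theta2 as fixed parameters, i.e. f(x) = sigmoid(theta2 * sigmoid(theta1 * x)). This function takes x and, using the given trained weights (theta1, theta2), returns an output between 0 and 1.
3. Apply gradient descent w.r.t. x on the neural network function f(x) to obtain the x that maximizes f(x) for the given theta1 and theta2.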
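The function described in the steps above can be sketched in plain NumPy, independent of Theano. This is only an illustration: the helper names `sigmoid` and `f` are hypothetical, the weights are random placeholders rather than trained values, and the bias unit is assumed to be a constant 1 appended at each layer, as in the code further down.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f(x, theta1, theta2):
    """Forward pass with a bias unit fixed to 1 appended at each layer."""
    h = sigmoid(theta1.T @ np.append(x, 1.0))        # hidden activations, shape (hid_units,)
    return sigmoid(theta2.T @ np.append(h, 1.0))[0]  # scalar output in (0, 1)

# Placeholder weights (random, NOT trained), only to exercise the function
rng = np.random.default_rng(0)
theta1 = rng.random((2 + 1, 3))   # (in_units + 1, hid_units)
theta2 = rng.random((3 + 1, 1))   # (hid_units + 1, out_units)

out = f(np.array([0.0, 1.0]), theta1, theta2)
print(out)
```

With trained weights in place of the placeholders, `f` plays the role of the fixed-parameter function that step 3 optimizes over x.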

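For this I have implemented the following code as a toy example (k = 2). It is based on the tutorial at http://outlace.com/Beginner-Tutorial-Theano/ but with the vector y changed so that only one input combination gives f(x) ~ 1, namely x = [0, 1].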

Edit 1: As suggested, optimizer is set to None and the bias unit is fixed to 1.

Step 1: Train the neural network. This runs fine, without errors.

import os
os.environ["THEANO_FLAGS"] = "optimizer=None"
import theano
import theano.tensor as T
import theano.tensor.nnet as nnet
import numpy as np

x = T.dvector()
y = T.dscalar()

def layer(x, w):
    b = np.array([1], dtype=theano.config.floatX)
    new_x = T.concatenate([x, b])
    m = T.dot(w.T, new_x) #theta1: 3x3 * x: 3x1 = 3x1 ;;; theta2: 1x4 * 4x1
    h = nnet.sigmoid(m)
    return h

def grad_desc(cost, theta):
    alpha = 0.1 #learning rate
    return theta - (alpha * T.grad(cost, wrt=theta))

in_units = 2
hid_units = 3
out_units = 1

theta1 = theano.shared(np.array(np.random.rand(in_units + 1, hid_units), dtype=theano.config.floatX)) # randomly initialize
theta2 = theano.shared(np.array(np.random.rand(hid_units + 1, out_units), dtype=theano.config.floatX))

hid1 = layer(x, theta1) #hidden layer

out1 = T.sum(layer(hid1, theta2)) #output layer
fc = (out1 - y)**2 #cost expression

cost = theano.function(inputs=[x, y], outputs=fc, updates=[
        (theta1, grad_desc(fc, theta1)),
        (theta2, grad_desc(fc, theta2))])
run_forward = theano.function(inputs=[x], outputs=out1)

inputs = np.array([[0,1],[1,0],[1,1],[0,0]]).reshape(4,2) #training data X
exp_y = np.array([1, 0, 0, 0]) #training data Y
cur_cost = 0
for i in range(5000):
    for k in range(len(inputs)):
        cur_cost = cost(inputs[k], exp_y[k]) #call our Theano-compiled cost function, it will auto update weights

print(run_forward([0,1]))

The forward-pass output for [0, 1] is: 0.968905860574. The trained weights can also be read with theta1.get_value() and theta2.get_value().

Step 2: Define the neural network function f(x). The trained weights (theta1, theta2) are constant parameters of this function.

Things get a bit trickier here because the bias unit is part of the input vector x. To handle this I concatenate b and x. The code now runs fine.

b = np.array([[1]], dtype=theano.config.floatX)
#b_sh = theano.shared(np.array([[1]], dtype=theano.config.floatX))
rand_init = np.random.rand(in_units, 1)
rand_init[0] = 1
x_sh = theano.shared(np.array(rand_init, dtype=theano.config.floatX))
th1 = T.dmatrix()
th2 = T.dmatrix()

nn_hid = T.nnet.sigmoid( T.dot(th1, T.concatenate([x_sh, b])) )
nn_predict = T.sum( T.nnet.sigmoid( T.dot(th2, T.concatenate([nn_hid, b]))))

Step 3: The problem now lies in the gradient descent, because x is not constrained to values between 0 and 1.

fc2 = (nn_predict - 1)**2

cost3 = theano.function(inputs=[th1, th2], outputs=fc2, updates=[
        (x_sh, grad_desc(fc2, x_sh))])
run_forward = theano.function(inputs=[th1, th2], outputs=nn_predict)

cur_cost = 0
for i in range(10000):
    cur_cost = cost3(theta1.get_value().T, theta2.get_value().T) #call our Theano-compiled cost function, it will auto update x_sh
    if i % 500 == 0: #only print the cost every 500 epochs/iterations (to save space)
        print('Cost: %s' % (cur_cost,))
        print(x_sh.get_value())

The last iteration prints: Cost: 0.000220317356533 [[-0.11492753] [1.99729555]]

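Moreover, input 1 keeps becoming more negative and input 2 keeps increasing, whereas the optimal solution is [0, 1]. How can this be fixed?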
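One possible way to keep x inside (0, 1), shown here only as a sketch and not taken from the post, is to optimize an unconstrained variable z and set x = sigmoid(z). The example below uses plain NumPy with a hand-written gradient instead of Theano. The names `forward`, `cost_and_grad_z`, `W1` and `W2` are hypothetical, and the weights are random placeholders, not the trained theta1 and theta2.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    """Return (output, hidden activations) with a bias of 1 appended per layer."""
    h = sigmoid(W1.T @ np.append(x, 1.0))
    o = sigmoid(W2.T @ np.append(h, 1.0))[0]
    return o, h

def cost_and_grad_z(z, W1, W2):
    """Cost (f(x) - 1)^2 and its gradient w.r.t. z, where x = sigmoid(z)."""
    x = sigmoid(z)
    o, h = forward(x, W1, W2)
    d_out = 2.0 * (o - 1.0) * o * (1.0 - o)        # d cost / d (output pre-activation)
    d_hid = d_out * W2[:-1, 0] * h * (1.0 - h)     # d cost / d (hidden pre-activations)
    grad_x = W1[:-1, :] @ d_hid                    # d cost / d x (bias row excluded)
    return (o - 1.0) ** 2, grad_x * x * (1.0 - x)  # chain rule through x = sigmoid(z)

# Placeholder weights (random, NOT the trained theta1/theta2 from the question)
rng = np.random.default_rng(0)
W1 = rng.random((3, 3))   # (in_units + 1, hid_units)
W2 = rng.random((4, 1))   # (hid_units + 1, out_units)

z = np.zeros(2)           # x starts at [0.5, 0.5]
alpha = 0.1               # same learning rate as in the question
initial_cost, _ = cost_and_grad_z(z, W1, W2)
for _ in range(1000):
    _, grad_z = cost_and_grad_z(z, W1, W2)
    z -= alpha * grad_z   # gradient descent on z; x stays in (0, 1) by construction

x_opt = sigmoid(z)
final_cost, _ = cost_and_grad_z(z, W1, W2)
print(x_opt, final_cost)
```

Because x is always the sigmoid of z, it can approach 0 or 1 but never leave the interval. Rounding the result gives a binary candidate, which should still be checked against the network's forward pass.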

Answer 1

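You are adding b = [1] via the broadcasting rules rather than concatenating it. Also, once you do concatenate it, your x_sh has one dimension too many, which is why the error occurs at nn_predict rather than at nn_hid.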
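The difference between adding and concatenating the bias can be illustrated with NumPy, used here as a stand-in on the assumption that Theano applies the same elementwise broadcasting to a length-1 constant. The variable names are illustrative only.

```python
import numpy as np

x = np.array([0.0, 1.0])   # input vector, shape (2,)
b = np.array([1.0])        # bias unit, shape (1,)

added = x + b                      # broadcasting: b is added to every element
joined = np.concatenate([x, b])    # concatenation: b becomes an extra element

print(added, added.shape)          # [1. 2.] (2,)
print(joined, joined.shape)        # [0. 1. 1.] (3,)

# A column-shaped input, as produced by np.random.rand(in_units, 1), is 2-D
x_col = np.random.rand(2, 1)
print(x_col.ndim, x_col.ravel().ndim)   # 2 1
```

Only the concatenated version has the extra element that the bias row of the weight matrix expects.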

Computing the optimal input of a neural network with Theano, using gradient descent w.r.t. the input
