Question

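I designed a neural network using Theano to approximate mathematical functions. However, I am unable to approximate non-linear functions such as 2x / (x + 3). The network does well on linear functions.

I am using 1 hidden layer with 2 neurons. I have tried increasing the number of neurons in the hidden layer, but that does not seem to solve the problem.

Note: the code has no errors whatsoever. It also tries to minimize the cost, but the predictions it makes are not what is expected (maybe it is learning some meaningless/undesired pattern).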

The (updated) code I am using is:

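import numpy as np
import theano
import theano.tensor as T
from theano.tensor import nnet

# Symbolic scalar input and target
x = T.dscalar()
y = T.dscalar()

# Training data: integers in [1, 6) and the target 2x / (x + 3)
inputs = np.random.randint(1,6,size=(500))
outputs = (inputs * 2.0) / (inputs + 3.0)

def layer(x, w):
    # Append a bias term of 1 to the input, then apply a sigmoid unit layer
    b = np.array([1], dtype=theano.config.floatX)
    x = b * x    # doing to be able to concatenate b and x
    x = T.concatenate([x, b])
    return nnet.sigmoid(T.dot(w.T, x))

def grad_desc(cost, theta):
    # One plain gradient-descent step with a fixed learning rate
    alpha = 0.01
    return theta - (alpha * (T.grad(cost, wrt=theta)))

# Weights: (input + bias) -> 6 hidden units, (6 hidden + bias) -> 1 output
theta1 = theano.shared(np.array(np.random.rand(2,6), dtype=theano.config.floatX))
theta2 = theano.shared(np.array(np.random.rand(7,1), dtype=theano.config.floatX))

h1 = layer(x, theta1)
h2 = layer(h1, theta2)
out = T.nnet.softmax(h2)    # softmax is applied over h2, which holds a single value
fc = T.mean(T.sqr(out - y))    # squared-error cost

back_prop = theano.function(inputs=[x,y], outputs=[fc], updates=[
                (theta1, grad_desc(fc, theta1)),
                (theta2, grad_desc(fc, theta2))
            ])
feed_forward = theano.function(inputs=[x], outputs=[out])

cur_cost = 0
for i in range(100):
    # One weight update per training sample
    for x_val, y_val in zip(inputs, outputs):
        cur_cost = back_prop(x_val, y_val)
    if i % 10 == 0:
        print("Epoch ", i // 10, " : ", cur_cost)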

Cost per epoch:

Epoch  0  :  [array(0.0625)]
Epoch  1  :  [array(0.0625)]
Epoch  2  :  [array(0.0625)]
Epoch  3  :  [array(0.0625)]
Epoch  4  :  [array(0.0625)]
Epoch  5  :  [array(0.0625)]
Epoch  6  :  [array(0.0625)]
Epoch  7  :  [array(0.0625)]
Epoch  8  :  [array(0.0625)]
Epoch  9  :  [array(0.0625)]

Testing:

test_values = np.random.randint(1,100, size=(1,6))
for i in test_values[0]:
    print("Result : ", feed_forward(i), "Actual : ", (2.0*i)/(i+3.0))

Any help is appreciated.

Answer 1

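Your code is not a neural network... at least not in a meaningful sense. It is just a linear model. A hidden layer is pointless when you have no non-linear activation functions, because everything you can model is still just a linear function of your input. Add non-linearities, add their respective gradients, and you will be able to model non-linear functions.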
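
The point about linearity can be checked numerically. The following is a minimal NumPy sketch, not the asker's Theano code; the weight shapes and the names `W1`, `W2`, `linear_net`, and `tanh_net` are assumptions chosen purely for illustration. It suggests that two stacked layers with no activation collapse into a single linear map, while putting `tanh` between them breaks that equivalence.

```python
import numpy as np

rng = np.random.RandomState(0)
W1 = rng.randn(6, 1)   # hypothetical hidden-layer weights: 1 input -> 6 units
W2 = rng.randn(1, 6)   # hypothetical output-layer weights: 6 units -> 1 output

def linear_net(x):
    """Two layers with no activation in between."""
    return W2.dot(W1.dot(x))

def tanh_net(x):
    """Same weights, with a tanh non-linearity on the hidden layer."""
    return W2.dot(np.tanh(W1.dot(x)))

x = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]])   # shape (1, 5): five scalar inputs

# Without an activation, the two layers equal the single linear map W2.dot(W1).
collapsed = W2.dot(W1).dot(x)
print(np.allclose(linear_net(x), collapsed))

# A linear map satisfies f(2x) == 2 * f(x); the tanh network generally does not.
print(np.allclose(linear_net(2 * x), 2 * linear_net(x)))
print(np.allclose(tanh_net(2 * x), 2 * tanh_net(x)))
```

However many activation-free layers are stacked, the result can always be rewritten as one matrix product, which is why extra hidden units alone would not help in that setting.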

Approximating a non-linear function with a NN using Theano
