Theano: using `scan` instead of a `for` loop in linear regression

Date: 2016-10-25 18:20:24

Tags: python machine-learning theano linear-regression theano.scan

I am trying to get a better grasp of the `scan` functionality in Theano. My understanding, based on the documentation, is that it behaves like a `for` loop. I created a very simple working example that finds the weight and bias when performing linear regression.
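For reference, the canonical power example from the Theano documentation shows this loop-like behaviour in its simplest form (a minimal sketch, separate from the regression code below):

import numpy as np
import theano
import theano.tensor as T

k = T.iscalar('k')
A = T.vector('A')

# Each scan step multiplies the running result by A, just like the body
# of `for _ in range(k): result *= A`.
result, updates = theano.scan(fn=lambda prior_result, A: prior_result * A,
                              outputs_info=T.ones_like(A),
                              non_sequences=A,
                              n_steps=k)

# `result` stacks the value from every step; only the final power is needed.
power = theano.function(inputs=[A, k], outputs=result[-1], updates=updates)
print(power(np.arange(10, dtype=theano.config.floatX), 2))  # [0, 1, 4, ..., 81]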

#### Libraries
# Third Party Libraries
import numpy as np
import theano
import theano.tensor as T

# not intended for mini-batch
def gen_data(num_points=50, slope=1, bias=10, x_max=50):
    f = lambda z: slope * z + bias
    x = np.zeros(shape=(num_points), dtype=theano.config.floatX)
    y = np.zeros(shape=(num_points), dtype=theano.config.floatX)

    for i in range(num_points):
        x_temp = np.random.uniform()*x_max
        x[i] = x_temp
        y[i] = f(x_temp) + np.random.normal(scale=3.0)

    return (x, y)

#############################################################
#############################################################
train_x, train_y = gen_data(num_points=50, slope=2, bias=5)
epochs = 50

# Declaring variable
learn_rate = T.scalar(name='learn_rate', dtype=theano.config.floatX)
x = T.vector(name='x', dtype=theano.config.floatX)
y = T.vector(name='y', dtype=theano.config.floatX)
# Variables that will be updated
theta = theano.shared(np.random.rand(), name='theta')
bias = theano.shared(np.random.rand(), name='bias')

hyp = T.dot(theta, x) + bias
cost = T.mean((hyp - y)**2)/2
f_cost = theano.function(inputs=[x, y], outputs=cost)

grad_t, grad_b = T.grad(cost, [theta, bias])

train = theano.function(inputs=[x, y, learn_rate], outputs=cost,
                        updates=((theta, theta-learn_rate*grad_t), 
                                 (bias, bias-learn_rate*grad_b)))

print('weight: {}, bias: {}'.format(theta.get_value(), bias.get_value()))

for i in range(epochs): # Try changing this to a `scan`
    train(train_x, train_y, 0.001)

print('------------------------------')
print('weight: {}, bias: {}'.format(theta.get_value(), bias.get_value()))

I would like to change the `for` loop into a `theano.scan` function, but every attempt I have made has produced one error message after the next.

1 Answer:

Answer 0 (score: 1)

In order to use `theano.scan`, I imported `OrderedDict` from `collections` to hold the shared-variable updates. Using a plain `dict` will result in the following error message:

Expected OrderedDict or OrderedUpdates, got <class 'dict'>. This can make your script non-deterministic.
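For illustration, a hypothetical step function that returns its updates in a plain `dict` (not code from the post) is exactly what triggers that error:

# Hypothetical: identical to the step function below, except the updates
# are returned in a plain dict, which Theano rejects because a dict's
# iteration order is not guaranteed to be deterministic.
def cost_bad(inputs, outputs, learn_rate, theta, bias):
    hyp = T.dot(theta, inputs) + bias
    loss = T.mean((hyp - outputs)**2)/2
    grad_t, grad_b = T.grad(loss, [theta, bias])
    return loss, {theta: theta - learn_rate*grad_t,
                  bias: bias - learn_rate*grad_b}  # plain dict -> error above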

Secondly, I defined a function in which the loss and the gradients are computed. The function returns the `loss` and an `OrderedDict()` of updates. The function:

def cost(inputs, outputs, learn_rate, theta, bias):
    hyp = T.dot(theta, inputs) + bias
    loss = T.mean((hyp - outputs)**2)/2

    grad_t, grad_b = T.grad(loss, [theta, bias])

    return loss, OrderedDict([(theta, theta-learn_rate*grad_t),
                              (bias, bias-learn_rate*grad_b)])

Then the `theano.scan()` is defined:

results, updates = theano.scan(fn=cost,
                               non_sequences=[x, y, learn_rate, theta, bias],
                               n_steps=epochs)

I chose to include `x` and `y` as `non_sequences` because, at the relatively small size of this toy example, it is about twice as fast as passing them in as `sequences`.
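For comparison, the `sequences` variant (included commented-out in the full listing below) would feed the data in element by element:

# Slower alternative for this example: scan iterates over x and y,
# passing one (x[t], y[t]) pair to `cost` at each step.
results, updates = theano.scan(fn=cost,
                               sequences=[x, y],
                               non_sequences=[learn_rate, theta, bias],
                               n_steps=epochs)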

Lastly, the `theano.function()` is defined using the `results, updates` returned by `theano.scan()`:

train = theano.function(inputs=[x, y, learn_rate, epochs], outputs=results,
                        updates=updates)

Putting this all together, we have:

#### Libraries
# Standard Libraries
from collections import OrderedDict

# Third Party Libraries
# import matplotlib.pyplot as plt
import numpy as np
# from sklearn import linear_model
import theano
import theano.tensor as T

# def gen_data(num_points=50, slope=1, bias=10, x_max=50):
#     pass  # Use the code in the above post to generate sample points

########################################################################
# Generate Data
train_x, train_y = gen_data(num_points=50, slope=2)

# Declaring variables
x = T.vector(name='x', dtype=theano.config.floatX)
y = T.vector(name='y', dtype=theano.config.floatX)
learn_rate = T.scalar(name='learn_rate', dtype=theano.config.floatX)
epochs = T.iscalar(name='epochs')

# Variables that will be updated, hence are declared as `theano.shared`
theta = theano.shared(np.random.rand(), name='theta')
bias = theano.shared(np.random.rand(), name='bias')

def cost(inputs, outputs, learn_rate, theta, bias):
    hyp = T.dot(theta, inputs) + bias
    loss = T.mean((hyp - outputs)**2)/2

    grad_t, grad_b = T.grad(loss, [theta, bias])

    return loss, OrderedDict([(theta, theta-learn_rate*grad_t),
                              (bias, bias-learn_rate*grad_b)])

results, updates = theano.scan(fn=cost,
                               non_sequences=[x, y, learn_rate, theta, bias],
                               n_steps=epochs)

# results, updates = theano.scan(fn=cost,
#                                sequences=[x, y],
#                                non_sequences=[learn_rate, theta, bias],
#                                n_steps=epochs)

train = theano.function(inputs=[x, y, learn_rate, epochs], outputs=results,
                        updates=updates)

print('weight: {}, bias: {}'.format(theta.get_value(), bias.get_value()))

train(train_x, train_y, 0.001, 30)

print('------------------------------')
print('weight: {}, bias: {}'.format(theta.get_value(), bias.get_value()))

For completeness, I have included the code to pass `x` and `y` in as `sequences`. Simply uncomment that section of the code, AND comment out the other instance of `theano.scan()` that takes them as `non_sequences`.
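Since `outputs=results` returns the loss from every scan step, the value returned by `train` can also be used to watch convergence. A small usage sketch (the `losses` name is mine, not from the post):

losses = train(train_x, train_y, 0.001, 30)  # one loss value per epoch
print('first epoch loss: {}'.format(losses[0]))
print('last epoch loss:  {}'.format(losses[-1]))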