Question

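I'm new to Theano and still struggling with its "pseudo-code" style on the one hand and its strict type checking on the other. My background is mostly in C and Python. Can someone point out where I'm going wrong in this sample code? It minimizes the mean squared error between the predicted y points and the training y points for the given x values, in order to find the best slope and intercept for a linear fit.

The code is below: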

import numpy as np
import theano
import theano.tensor as T
from collections import OrderedDict

class LinearModel:
    def __init__(self,num_points):
        self.m = theano.shared(value=0.1,name='m')
        self.b = theano.shared(value=1, name='b')
        self.params = [self.m, self.b]

        def step(x_t):
            y_t = self.m * x_t + self.b
            return y_t

        self.x = T.matrix('x',dtype=theano.config.floatX)
        self.y, _ = theano.scan(
                        fn=step,
                        sequences=self.x,
                    ) 

        self.loss = lambda y_train: self.mse(y_train)

    def mse(self, y_train):
        return T.mean((self.y - y_train) ** 2)

    def fit(self,x, y, learning_rate=0.01, num_iter=100):
        trainset_x = theano.tensor._shared(x.astype(np.dtype(np.float32)),borrow=True)
        trainset_y = theano.tensor._shared(y.astype(np.dtype(np.float32)),borrow=True)
        n_train = trainset_x.get_value(borrow=True).shape[0]

        cost = self.loss(trainset_y)
        gparams = T.grad(cost,self.params)

        l_r = T.scalar('l_r', dtype=theano.config.floatX)

        updates = OrderedDict()
        for param,gparam in zip(self.params,gparams):
            updates[param] = param - l_r * gparam

        self.train_model = theano.function(  inputs=[l_r],
                                        outputs=[cost,self.y],
                                        updates=updates,
                                        givens={
                                              self.x: trainset_x,
                                            }
                                        )

        epoch = 0
        while epoch < num_iter:
            cost, _ = self.train_model(learning_rate)
            m = self.m.get_value()
            b = self.b.get_value()
            print "epoch: ",epoch," cost: ",cost," m: ",m," b: ",b


if __name__ == '__main__':
    lin = LinearModel(10)
    x = np.arange(10)
    y = np.random.rand(10)
    lin.fit(x,y,learning_rate=0.01,num_iter=100)

The error is:

Traceback (most recent call last):
  File "~/EclipseWorkspace/MemoryNetworkQA.Theano/linear_regression.py", line 70, in <module>
    lin.fit(x,y,learning_rate=0.01,num_iter=100)
  File "~/EclipseWorkspace/MemoryNetworkQA.Theano/linear_regression.py", line 54, in fit
    self.x: trainset_x,
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 266, in function
    profile=profile)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 489, in pfunc
    no_default_updates=no_default_updates)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 217, in rebuild_collect_shared
    raise TypeError(err_msg, err_sug)

TypeError: ('An update must have the same type as the original shared variable (shared_var=b, shared_var.type=TensorType(int64, scalar), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, scalar)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')

Answer 1

This code will not execute until the following problems are fixed.

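1. The error reported in the question is caused by a mismatch between the type of self.b and the type of its update. self.b has no type specified, so one is inferred. The initial value is a Python integer, so the inferred type is int64. The update is floatX because the learning rate is floatX. You cannot update an int64 shared variable with a floatX value. The solution is to make the initial value a Python float, which gives an inferred type of floatX. Change self.b = theano.shared(value=1, name='b') to self.b = theano.shared(value=1., name='b') (note the decimal point after the 1).
2. The next problem is that self.x is defined as a matrix, but the value passed in the function call on the last line is a vector. One solution is to reshape x into a matrix, e.g. change x = np.arange(10) to x = np.arange(10).reshape(1,10).
3. The trainset shared variables have type float32, which conflicts with the other parts of the code that use floatX. If your floatX=float32 there should be no problem, but it is safer to use floatX everywhere so that the same float type is kept throughout. Change trainset_x = theano.tensor._shared(x.astype(np.dtype(np.float32)),borrow=True) to trainset_x = theano.tensor._shared(x.astype(theano.config.floatX),borrow=True), and do the same for trainset_y.
4. The number of epochs currently has no effect because epoch is never incremented. Change while epoch < num_iter: to for epoch in xrange(num_iter): and remove epoch = 0.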
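The type mismatch in problem 1 can be illustrated without Theano. The following is only a rough NumPy analogue, not Theano's own code path: NumPy also infers an integer dtype from a Python integer, and it also refuses to write a float result back into an integer array in place. The helper name apply_update is hypothetical and exists only for this sketch.

```python
import numpy as np

# An integer initial value gives an integer array; a float gives a float array.
b_int = np.asarray(1)
b_float = np.asarray(1.)
print(b_int.dtype.kind, b_float.dtype.kind)  # kind 'i' is integer, 'f' is float


def apply_update(param, step):
    """Hypothetical helper: try an in-place update, report whether it was accepted."""
    try:
        param -= step  # the float result must be cast back to param's dtype
        return True
    except TypeError:
        # NumPy rejects the float -> integer cast under its default casting rule.
        return False


int_ok = apply_update(b_int, 0.005)
float_ok = apply_update(b_float, 0.005)
print(int_ok, float_ok)
```

Starting from 1. instead of 1 is the same fix the answer proposes for self.b: the stored type and the update type then agree.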

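In addition,

- The parameters look as if they are not being updated, but that impression is misleading. Because of problem 4 above, the iterations pass quickly and never stop, and the learning rate is large enough for the model to converge very quickly. Try changing the learning rate to something much smaller, e.g. 0.0001, and look at the output for only the first 100 epochs.
- I'd recommend avoiding theano.tensor._shared unless you really do need to force a shared variable to be allocated on the CPU when device=gpu. The preferred method is theano.shared.
- The n_train variable isn't used anywhere.
- You use givens inconsistently. I'd recommend using it for both x and y, or for neither. Take a look at the logistic regression tutorial for more pointers on this.
- The Theano function is recompiled on every call to fit, but you'd do better to compile it just once and reuse it on each fit.
- This model can be implemented without using scan. In general, scan is needed only when the output of a step is a function of the output of an earlier step. scan is also generally much slower than the alternatives and should be avoided where possible. You can remove the use of scan by using self.y = self.m * self.x + self.b instead.
- If you do use scan, it is good practice to enable strict mode by passing strict=True in the call to scan.
- It is good practice to provide the types of all shared variables explicitly. You do this for trainset_x and trainset_y but not for self.m and self.b.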
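To make the points about scan and the epoch loop concrete, here is a minimal sketch in plain NumPy rather than Theano. It is an illustration under stated assumptions, not the asker's code: the gradients of the mean squared error are derived by hand where Theano would use T.grad, the function name fit_line is hypothetical, and the random data is seeded only so the run is repeatable. The prediction m * x + b is computed on the whole array at once, which is the same idea as replacing scan with self.y = self.m * self.x + self.b.

```python
import numpy as np


def fit_line(x, y, learning_rate=0.01, num_iter=100, m=0.1, b=1.0):
    """Gradient descent on the mean squared error of y ~ m * x + b."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    losses = []
    for epoch in range(num_iter):  # a for loop cannot forget to advance epoch
        residual = (m * x + b) - y  # whole-array prediction, no per-element step
        losses.append(np.mean(residual ** 2))
        grad_m = 2.0 * np.mean(residual * x)  # d(MSE)/dm, derived by hand
        grad_b = 2.0 * np.mean(residual)      # d(MSE)/db, derived by hand
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b, losses


rng = np.random.RandomState(0)  # fixed seed, an assumption for repeatability
x = np.arange(1.0, 11.0)
y = rng.rand(10)
m, b, losses = fit_line(x, y)
print("first loss: {:.4f}, last loss: {:.4f}".format(losses[0], losses[-1]))
```

Printing the loss for a fixed number of epochs, as suggested above, makes it easier to see that the parameters are in fact moving.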

Answer 2

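OK, I found that the problem was indeed with self.b. After initializing it with an explicit float, the type error went away.

But the slope and intercept (self.m and self.b), which are still Theano shared variables and are being passed through updates, are not really getting updated. If anyone can tell me why, that would be a great help. Thanks.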

import numpy as np
import theano
import theano.tensor as T
from collections import OrderedDict

class LinearModel:
    def __init__(self,num_points):
        self.m = theano.shared(value=0.1,name='m')
        self.b = theano.shared(value=1.0, name='b')
        self.params = [self.m, self.b]

        def step(x_t):
            y_t = self.m * x_t + self.b
            return y_t

        #self.x = T.matrix('x',dtype=theano.config.floatX)
        #self.x = T.dmatrix('x')
        self.x = T.vector('x',dtype=theano.config.floatX)
        self.y, _ = theano.scan(
                        fn=step,
                        sequences=self.x,
                    ) 

        self.loss = lambda y_train: self.mse(y_train)

    def mse(self, y_train):
        return T.mean((self.y - y_train) ** 2)

    def fit(self,x, y, learning_rate=0.01, num_iter=100):
        trainset_x = theano.tensor._shared(x.astype(np.dtype(np.float32)),borrow=True)
        trainset_y = theano.tensor._shared(y.astype(np.dtype(np.float32)),borrow=True)
        n_train = trainset_x.get_value(borrow=True).shape[0]

        cost = self.loss(trainset_y)
        gparams = T.grad(cost,self.params)

        l_r = T.scalar('l_r', dtype=theano.config.floatX)

        updates = OrderedDict()
        for param,gparam in zip(self.params,gparams):
            updates[param] = param - l_r * gparam

        self.train_model = theano.function(  inputs=[l_r],
                                        outputs=[cost,self.y],
                                        updates=updates,
                                        givens={
                                              self.x: trainset_x,
                                            }
                                        )


        epoch = 0
        while epoch < num_iter:
            cost, _ = self.train_model(learning_rate)
            m = self.m.get_value()
            b = self.b.get_value()
            print "epoch: ",epoch," cost: ",cost," m: ",m," b: ",b
            epoch += 1


if __name__ == '__main__':
    lin = LinearModel(10)
    x = np.array([1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0])
    y = np.random.rand(10)
    lin.fit(x,y,learning_rate=0.01,num_iter=100)

Why do I get this Theano TypeError when I pass a numpy array to the givens parameter of a theano function?
