Question

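I'm currently working through the Theano tutorial on logistic regression, much like the one discussed in this post: what-does-negative-log-likelihood-of-logistic-regression-in-theano-look-like. However, the original tutorial uses the shared variables W and b, plus a matrix named input. input is an n x n_in matrix, W is n_in x n_out, and b is an n_out x 1 column vector.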

    self.W = theano.shared(
        value=numpy.zeros(
            (n_in, n_out),
            dtype=theano.config.floatX
        ),
        name='W',
        borrow=True
    )

    self.b = theano.shared(
        value=numpy.zeros(
            (n_out,),
            dtype=theano.config.floatX
        ),
        name='b',
        borrow=True
    )

    self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)

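Now, as far as I understand from the documentation on shared variables, the broadcast pattern of a shared variable defaults to False. So why doesn't this line raise an error because of mismatched dimensions?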

    self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)

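After all, we are adding the matrix T.dot(input, self.W) to the vector b. Are shared variables broadcast by default after all? And even with broadcasting, the dimensions don't add up: T.dot(input, self.W) is an n x n_out matrix, while b is an n_out x 1 vector.

What am I missing?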

Answer 1

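Shared variables are not broadcastable by default, because their shape can change. However, Theano takes care of the necessary broadcasting between tensors when asked to perform operations such as adding a scalar to a matrix or adding a vector to a matrix. Check the documentation for more details. Note that input is a symbolic variable, not a shared variable. A dimension mismatch would occur only if the second dimension of self.W (i.e. n_out) differed from the first dimension of the vector b. That is not the case here, so the two add up without any problem, just as they would in numpy: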

    import numpy as np

    # a has shape (4, 3) and b has shape (3,), so b is added to every row of a
    a = np.asarray([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
    b = np.asarray([1,2,3])
    print(a + b)
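To connect this to the shapes in the question, here is a minimal numpy-only sketch. The sizes n, n_in and n_out and the array x (standing in for input) are illustrative assumptions, not values from the tutorial. It shows that b as created in the question's code has shape (n_out,), a 1-D vector rather than an n_out x 1 column, and that a true n_out x 1 column would indeed fail to add when n differs from n_out:

```python
import numpy as np

# Illustrative sizes only (assumed, not taken from the tutorial)
n, n_in, n_out = 4, 5, 3

x = np.ones((n, n_in))               # stands in for `input`
W = np.zeros((n_in, n_out))          # same shape as self.W
b = np.arange(n_out, dtype=float)    # shape (n_out,), a 1-D vector like self.b

# (n, n_out) + (n_out,): trailing dimensions match, so b is added to each row
scores = x.dot(W) + b
print(scores.shape)

# An actual (n_out, 1) column does not line up with (n, n_out) when n != n_out
b_col = b.reshape(n_out, 1)
try:
    x.dot(W) + b_col
    column_ok = True
except ValueError:
    column_ok = False
print(column_ok)
```

Shapes are compared from the last dimension backwards, so a 1-D vector of length n_out is treated as a row and repeated across the n rows of the matrix.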

Using Theano:
