In the logistic regression example provided in the Theano tutorial, the negative_log_likelihood
function is written as follows:
    def negative_log_likelihood(self, y):
        """Return the mean of the negative log-likelihood of the prediction
        of this model under a given target distribution.

        .. math::

            \frac{1}{|\mathcal{D}|} \mathcal{L} (\theta=\{W,b\}, \mathcal{D}) =
            \frac{1}{|\mathcal{D}|} \sum_{i=0}^{|\mathcal{D}|} \log(P(Y=y^{(i)}|x^{(i)}, W,b)) \\
            \ell (\theta=\{W,b\}, \mathcal{D})

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label

        Note: we use the mean instead of the sum so that
              the learning rate is less dependent on the batch size
        """
        # y.shape[0] is (symbolically) the number of rows in y, i.e.,
        # number of examples (call it n) in the minibatch
        # T.arange(y.shape[0]) is a symbolic vector which will contain
        # [0,1,2,... n-1] T.log(self.p_y_given_x) is a matrix of
        # Log-Probabilities (call it LP) with one row per example and
        # one column per class LP[T.arange(y.shape[0]),y] is a vector
        # v containing [LP[0,y[0]], LP[1,y[1]], LP[2,y[2]], ...,
        # LP[n-1,y[n-1]]] and T.mean(LP[T.arange(y.shape[0]),y]) is
        # the mean (across minibatch examples) of the elements in v,
        # i.e., the mean log-likelihood across the minibatch.
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
Could someone explain exactly what the square brackets in the last line of the code above do? How should [T.arange(y.shape[0]), y]
be interpreted?
Thanks!
Answer 0 (score: 1)
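Most of what you need is already in the comments inside the function.
T.log(self.p_y_given_x)
is a symbolic Theano matrix with one row per example and one column per class. It is not an actual NumPy array, but it supports NumPy-style indexing.
So [T.arange(y.shape[0]), y] indexes into that matrix. This is NumPy advanced (integer-array) indexing rather than an ordinary slice: the two index vectors are paired element by element, so entry i of the result is the element at row i, column y[i]. See: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html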
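To make the pairing concrete, here is a minimal sketch in plain NumPy rather than Theano, on the assumption that the symbolic expression follows the same indexing rule. The matrix LP and the labels y below are made-up values chosen only for illustration; they do not come from the tutorial.

```python
import numpy as np

# Hypothetical log-probability matrix LP: 3 examples (rows), 4 classes (columns).
LP = np.log(np.array([
    [0.10, 0.20, 0.30, 0.40],
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
]))

# Hypothetical correct labels, one per example (row).
y = np.array([3, 0, 2])

# Row indices 0..n-1, the NumPy counterpart of T.arange(y.shape[0]).
rows = np.arange(y.shape[0])

# Integer-array indexing pairs rows[i] with y[i], giving one value per example.
v = LP[rows, y]

# The same values picked out one element at a time, for comparison.
expected = np.array([LP[0, 3], LP[1, 0], LP[2, 2]])
```

Note that LP[:, y] would be something different: it selects whole columns and returns a 3x3 matrix instead of a length-3 vector.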
Answer 1 (score: 0)
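I was also confused by the matrix indexing here. T.arange(y.shape[0]) is a symbolic vector [0, 1, ..., n-1], where n = y.shape[0] is the minibatch size you set. y is a vector of labels with the same length as T.arange(y.shape[0]). So, following the reference from @William Denman, this indexing means: for each row of the matrix T.log(self.p_y_given_x), we pick the column given by y (y holds the gold labels, which are also used here as column indices).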
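The "one column per row" reading can be checked against an explicit loop. The sketch below again uses NumPy as a stand-in for Theano; both function names and the sample probabilities are hypothetical, written only to show that the indexed version and the loop compute the same mean.

```python
import numpy as np

def nll_indexed(p_y_given_x, y):
    """NumPy stand-in for the Theano return expression (illustrative only)."""
    n = y.shape[0]
    return -np.mean(np.log(p_y_given_x)[np.arange(n), y])

def nll_loop(p_y_given_x, y):
    """Same quantity, computed row by row."""
    total = 0.0
    for i, label in enumerate(y):
        # For row i, take the probability assigned to the correct class.
        total += np.log(p_y_given_x[i, label])
    return -total / len(y)

# Made-up minibatch: 4 examples, 2 classes, each row sums to 1.
p = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.5, 0.5],
              [0.3, 0.7]])
y = np.array([0, 1, 1, 0])

nll_a = nll_indexed(p, y)
nll_b = nll_loop(p, y)
```

Both calls average -log of the probabilities 0.9, 0.8, 0.5 and 0.3, which are the entries at (row i, column y[i]).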