I'm following the Theano tutorial (specifically http://deeplearning.net/tutorial/logreg.html) and I'm trying to understand this supposedly "self-explanatory" code:
# y.shape[0] is (symbolically) the number of rows in y, i.e.,
# the number of examples (call it n) in the minibatch.
# T.arange(y.shape[0]) is a symbolic vector which will contain
# [0, 1, 2, ..., n-1].
# T.log(self.p_y_given_x) is a matrix of log-probabilities
# (call it LP) with one row per example and one column per class.
# LP[T.arange(y.shape[0]), y] is a vector v containing
# [LP[0, y[0]], LP[1, y[1]], LP[2, y[2]], ..., LP[n-1, y[n-1]]].
# T.mean(LP[T.arange(y.shape[0]), y]) is the mean (across
# minibatch examples) of the elements in v, i.e., the mean
# log-likelihood across the minibatch.
return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
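To see what that indexing does with concrete numbers, here is a minimal sketch in plain NumPy rather than Theano, using made-up probabilities, of the same computation:

import numpy as np

# Hypothetical minibatch: n = 3 examples, 2 classes.
p_y_given_x = np.array([[0.9, 0.1],
                        [0.2, 0.8],
                        [0.7, 0.3]])
y = np.array([0, 1, 0])           # correct class for each example

LP = np.log(p_y_given_x)          # log-probabilities, one row per example
v = LP[np.arange(y.shape[0]), y]  # [LP[0, y[0]], LP[1, y[1]], LP[2, y[2]]]
print(-v.mean())                  # mean negative log-likelihood, ~0.2284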
My question is:
Answer (score: 1):
Found the answer (NLL stands for negative log-likelihood):

NLL is a symbolic variable; to get the actual value of NLL, this symbolic expression must be compiled into a Theano function (see the Theano tutorial for more details):
NLL = -T.sum(T.log(p_y_given_x)[T.arange(y.shape[0]), y])
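As a rough sketch of that compilation step, assuming standalone symbolic variables p_y_given_x and y rather than the tutorial's class attributes:

import numpy as np
import theano
import theano.tensor as T

p_y_given_x = T.dmatrix('p_y_given_x')  # per-class probabilities
y = T.lvector('y')                      # correct labels

NLL = -T.sum(T.log(p_y_given_x)[T.arange(y.shape[0]), y])

# Compiling the symbolic graph yields an ordinary callable.
f = theano.function([p_y_given_x, y], NLL)

probs = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
print(f(probs, np.array([0, 1])))       # -(log(0.9) + log(0.8)) ≈ 0.3285

Note that the tutorial's class method returns the mean rather than the sum, so the scale of the loss (and hence a sensible learning rate) does not depend on the minibatch size.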
A note on the syntax: T.arange(y.shape[0]) is a vector of integers [0, 1, 2, ..., len(y)-1]. Indexing a matrix M with two vectors [0, 1, ..., K] and [a, b, ..., k] returns the elements M[0, a], M[1, b], ..., M[K, k] as a vector. Here, we use this syntax to retrieve the log-probabilities of the correct labels, y.
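To make the indexing rule concrete, here is a small NumPy example with an arbitrary matrix (Theano's advanced indexing follows the same convention):

import numpy as np

M = np.array([[10, 11, 12],
              [20, 21, 22],
              [30, 31, 32]])

# One element per row: M[0, 2], M[1, 0], M[2, 1]
print(M[[0, 1, 2], [2, 0, 1]])  # [12 20 31]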