I'm following the Theano tutorial (specifically http://deeplearning.net/tutorial/logreg.html) and I'm trying to understand this supposedly "self-explanatory" code:
# y.shape[0] is (symbolically) the number of rows in y, i.e.,
# the number of examples (call it n) in the minibatch.
# T.arange(y.shape[0]) is a symbolic vector which will contain
# [0, 1, 2, ..., n-1].
# T.log(self.p_y_given_x) is a matrix of log-probabilities
# (call it LP) with one row per example and one column per class.
# LP[T.arange(y.shape[0]), y] is a vector v containing
# [LP[0, y[0]], LP[1, y[1]], LP[2, y[2]], ..., LP[n-1, y[n-1]]].
# T.mean(LP[T.arange(y.shape[0]), y]) is the mean (across
# minibatch examples) of the elements in v, i.e., the mean
# log-likelihood across the minibatch.
return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
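To see what that indexing does with concrete numbers, here is a minimal sketch in plain NumPy rather than Theano, using made-up probabilities, of the same computation:

import numpy as np

# Hypothetical minibatch: n = 3 examples, 2 classes.
p_y_given_x = np.array([[0.9, 0.1],
                        [0.2, 0.8],
                        [0.7, 0.3]])
y = np.array([0, 1, 0])           # correct class for each example

LP = np.log(p_y_given_x)          # log-probabilities, one row per example
v = LP[np.arange(y.shape[0]), y]  # [LP[0, y[0]], LP[1, y[1]], LP[2, y[2]]]
print(-v.mean())                  # mean negative log-likelihood, ~0.2284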
My question is:
Answer (score: 1):
Found the answer (NLL stands for negative log-likelihood):

NLL is a symbolic variable; to get the actual value of NLL, this symbolic expression must be compiled into a Theano function (see the Theano tutorial for more details):
NLL = -T.sum(T.log(p_y_given_x)[T.arange(y.shape[0]), y])
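As a rough sketch of that compilation step, assuming standalone symbolic variables p_y_given_x and y rather than the tutorial's class attributes:

import numpy as np
import theano
import theano.tensor as T

p_y_given_x = T.dmatrix('p_y_given_x')  # per-class probabilities
y = T.lvector('y')                      # correct labels

NLL = -T.sum(T.log(p_y_given_x)[T.arange(y.shape[0]), y])

# Compiling the symbolic graph yields an ordinary callable.
f = theano.function([p_y_given_x, y], NLL)

probs = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
print(f(probs, np.array([0, 1])))       # -(log(0.9) + log(0.8)) ≈ 0.3285

Note that the tutorial's class method returns the mean rather than the sum, so the scale of the loss (and hence a sensible learning rate) does not depend on the minibatch size.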
A note on the syntax: T.arange(y.shape[0]) is a vector of integers [0, 1, 2, ..., len(y)-1]. Indexing a matrix M with two vectors [0, 1, ..., K] and [a, b, ..., k] returns the elements M[0, a], M[1, b], ..., M[K, k] as a vector. Here, we use this syntax to retrieve the log-probabilities of the correct labels, y.
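To make the indexing rule concrete, here is a small NumPy example with an arbitrary matrix (Theano's advanced indexing follows the same convention):

import numpy as np

M = np.array([[10, 11, 12],
              [20, 21, 22],
              [30, 31, 32]])

# One element per row: M[0, 2], M[1, 0], M[2, 1]
print(M[[0, 1, 2], [2, 0, 1]])  # [12 20 31]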