Question

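To be clear, by weights I mean the entries of the matrices (Ws) of the affine transformations in the nodes of the neural network.

I am starting with categorical_crossentropy as my loss function. I want to add an extra term that penalizes negative weights. To do this, I want to introduce a term of the form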

theano.tensor.sum(theano.tensor.exp(-10 * ws))

where "ws" are the weights.

If I follow the source code of categorical_crossentropy:
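To get a feel for how this term behaves, here is a small numerical sketch in plain Python (no Theano). The helper name `negative_weight_penalty` and the sample weights are made up for illustration; only the formula sum(exp(-10 * w)) comes from the question.

```python
import math

def negative_weight_penalty(weights, scale=10.0):
    """Return sum(exp(-scale * w)) over all entries of a nested list of weights."""
    return sum(math.exp(-scale * w) for row in weights for w in row)

# Hypothetical 2x2 weight matrices, chosen only to show the contrast.
positive = [[0.5, 1.0], [0.2, 0.8]]   # all entries positive
mixed = [[0.5, -1.0], [0.2, 0.8]]     # one negative entry

print(negative_weight_penalty(positive))  # small: every exponent is negative
print(negative_weight_penalty(mixed))     # large: dominated by exp(10)
```

Note that a weight of exactly zero contributes 1 to the sum, so the term is never zero; it only becomes small when the weights are clearly positive.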

     if true_dist.ndim == coding_dist.ndim: 
         return -tensor.sum(true_dist *tensor.log(coding_dist), axis=coding_dist.ndim - 1)
     elif true_dist.ndim == coding_dist.ndim - 1:
         return crossentropy_categorical_1hot(coding_dist, true_dist)
     else:
         raise TypeError('rank mismatch between coding and true distributions')

it seems I should update the third line (from the bottom) to read

crossentropy_categorical_1hot(coding_dist, true_dist) + theano.tensor.sum(theano.tensor.exp(- 10 * ws))

and change the function's signature to my_categorical_crossentropy(coding_dist, true_dist, ws). Where I call my_categorical_crossentropy, I write

loss = my_categorical_crossentropy(net_output, true_output, l_layers[1].W)

where, to begin with, l_layers[1].W holds the weights of the first layer of my neural network.

With these updates, I go on to write:

 loss = aggregate(loss, mode = 'mean')  
 updates = sgd(loss, all_params, learning_rate = 0.005) 
 train = theano.function([l_input.input_var, true_output], loss, updates = updates)
[...]

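This compiles, everything runs smoothly, and the network trains. However, for some reason the additional term theano.tensor.sum(theano.tensor.exp(- 10 * ws)) is ignored and does not seem to affect the loss value.

I tried looking through the Theano documentation, but so far I can't figure out what is going wrong. The weights l_layers[1].W are a shared variable, so I can't pass them in as an input like this: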

train = theano.function([l_input.input_var, true_output, l_layers[1].W], loss, updates = updates)

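Any comments are welcome. Thanks!

Solution

Although I have not figured out why what I did didn't work, adding the penalty term outside of 'categorical_crossentropy', as suggested in the comments, did solve the problem: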

 loss = aggregate(categorical_crossentropy(net_output, true_output) + theano.tensor.sum(theano.tensor.exp(- 10 * l_layers[1].W)))
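
One possible explanation, offered as an assumption since the question does not confirm it: in the quoted source, the penalty was added only to the `elif` branch, which runs when the targets have one dimension fewer than the predictions (integer class labels). If `true_output` was a one-hot matrix with the same number of dimensions as `net_output`, the first branch would return before the modified line is ever reached. The plain-Python sketch below mimics only that rank-based dispatch; `which_branch` is a hypothetical helper, not part of Theano.

```python
def which_branch(true_ndim, coding_ndim):
    """Mimic the rank-based dispatch in the quoted categorical_crossentropy."""
    if true_ndim == coding_ndim:
        # Dense targets: -sum(true_dist * log(coding_dist)); this line was not modified.
        return "dense"
    elif true_ndim == coding_ndim - 1:
        # Integer labels: crossentropy_categorical_1hot; the penalty was added here.
        return "one_hot"
    raise TypeError("rank mismatch between coding and true distributions")

# Matrix targets against matrix predictions: the modified line is skipped.
print(which_branch(2, 2))
# Label vector against matrix predictions: the modified line would run.
print(which_branch(1, 2))
```

Adding the penalty after the call, as in the line above, applies it regardless of which branch is taken.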

New to Theano. Trying to add a term to the loss function that penalizes negative weights
