Question

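I am trying to implement an RNN and have output predictions p_y of shape (batch_size, time_points, num_classes). I also have a target_output of shape (batch_size, time_points), where the value at a given index of target_output is an integer denoting the class (a value between 0 and num_classes-1). How can I index p_y with target_output to get the probabilities of the given classes, which I need to compute the cross-entropy?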

I'm not even sure how to do this in numpy. The expression p_y[target_output] does not give the desired result.

Answer 1

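You need to use advanced indexing (search the documentation for "advanced indexing"). But Theano's advanced indexing behaves differently from numpy's, so knowing how to do this in numpy may not be all that helpful!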

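Here is a function that does this for my setup, but note that the order of my dimensions differs from yours: I use (time_points, batch_size, num_classes). It also assumes you want the 1-of-N categorical cross-entropy variant. You may not want the sequence-length padding either.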

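import theano
import theano.tensor

def categorical_crossentropy_3d(coding_dist, true_dist, lengths):
    # coding_dist: predicted probabilities, shape (time_points, batch_size, num_classes)
    # true_dist: integer class labels, shape (time_points, batch_size)
    # lengths: number of valid time points for each sequence in the batch
    # Zero out the false probabilities and sum the remaining true probabilities to remove the third dimension.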
    indexes = theano.tensor.arange(coding_dist.shape[2])
    mask = theano.tensor.neq(indexes, true_dist.reshape((true_dist.shape[0], true_dist.shape[1], 1)))
    predicted_probabilities = theano.tensor.set_subtensor(coding_dist[theano.tensor.nonzero(mask)], 0.).sum(axis=2)

    # Pad short sequences with 1's: -log(1) = 0, so the pad locations add no loss (they are implicitly correct!)
    indexes = theano.tensor.arange(predicted_probabilities.shape[0]).reshape((predicted_probabilities.shape[0], 1))
    mask = indexes >= lengths
    predicted_probabilities = theano.tensor.set_subtensor(predicted_probabilities[theano.tensor.nonzero(mask)], 1.)

    return -theano.tensor.log(predicted_probabilities)
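
To make the masking logic easier to check by hand, here is a rough numpy mirror of the function above. It is a sketch under the same assumptions (time-major layout, integer labels, one length per sequence), not a drop-in replacement for the Theano version. The name categorical_crossentropy_3d_np and the toy data are made up for illustration.

```python
import numpy as np

def categorical_crossentropy_3d_np(coding_dist, true_dist, lengths):
    # coding_dist: (time_points, batch_size, num_classes) probabilities
    # true_dist:   (time_points, batch_size) integer class labels
    # lengths:     (batch_size,) number of valid time points per sequence
    class_ids = np.arange(coding_dist.shape[2])
    # True wherever the class is NOT the target class for that (t, b).
    wrong_class = class_ids != true_dist[:, :, None]
    # Zero the wrong classes, then sum over classes to keep the true probability.
    predicted = np.where(wrong_class, 0.0, coding_dist).sum(axis=2)

    # Positions at or beyond a sequence's length get probability 1, so -log gives 0.
    time_ids = np.arange(predicted.shape[0])[:, None]
    padded = time_ids >= lengths
    predicted = np.where(padded, 1.0, predicted)
    return -np.log(predicted)

# Toy data (hypothetical) in the asker's (batch_size, time_points, num_classes) layout.
rng = np.random.default_rng(0)
p_y = rng.random((2, 3, 4))
p_y /= p_y.sum(axis=2, keepdims=True)
target_output = np.array([[1, 3, 0], [0, 2, 1]])
lengths = np.array([3, 2])   # the second sequence has only 2 valid time points

# Move to the time-major layout the function expects.
loss = categorical_crossentropy_3d_np(np.transpose(p_y, (1, 0, 2)), target_output.T, lengths)
```

The result has shape (time_points, batch_size), and the padded position (time 2 of the second sequence) contributes zero loss.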

Batched cross-entropy with Theano
