I am trying to implement a multi-class log loss function in TensorFlow. Here is what I have come up with:
#self.sampled_actions: Tensor [batchsize, 2] [[0,1], [1,0], .......]
# which is one hot encoded
#self.probability: Tensor [batchsize, 2] [[.4,.6], [.3,.7], .....]
# Represents probability of each class by neural net output
#self.discounted_rewards Tensor [batchsize, 1] [0.4 0.5 0.5 -0.1, .....]
# Different weights for different data points
self.batch_loss = tf.log(tf.reduce_sum(self.sampled_actions * self.probability, axis=1))*self.discounted_rewards
self.loss = -tf.reduce_sum( self.batch_loss , axis=0)
However, my loss ends up being a vector instead of a single value. Where did I go wrong?
Answer 0 (score: 0)
Remove axis=0. See these reduce_sum examples from the official TensorFlow site:
x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.reduce_sum(x) # 6
tf.reduce_sum(x, 0) # [2, 2, 2]
tf.reduce_sum(x, 1) # [3, 3]
tf.reduce_sum(x, 1, keepdims=True) # [[3], [3]]
tf.reduce_sum(x, [0, 1]) # 6
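
It may also help to trace the shapes in the question's code. The sketch below is not the asker's code: it uses NumPy as a stand-in for TensorFlow (the broadcasting rules are the same), and the 3-sample batch and its values are made up for illustration. It suggests that the vector comes from multiplying a [batchsize] tensor by a [batchsize, 1] tensor, which broadcasts to [batchsize, batchsize] before the sum over axis 0.

```python
import numpy as np

# Hypothetical batch of 3 samples, mirroring the shapes in the question.
sampled_actions = np.array([[0., 1.], [1., 0.], [0., 1.]])    # [3, 2], one-hot
probability = np.array([[0.4, 0.6], [0.3, 0.7], [0.2, 0.8]])  # [3, 2]
discounted_rewards = np.array([[0.4], [0.5], [-0.1]])         # [3, 1]

# Log-probability of the sampled action for each sample: shape [3].
log_prob = np.log(np.sum(sampled_actions * probability, axis=1))

# [3] * [3, 1] broadcasts to [3, 3] instead of staying element-wise.
broadcast_product = log_prob * discounted_rewards

# Summing over axis 0 then leaves a vector of length 3.
vector_loss = -np.sum(broadcast_product, axis=0)

# Flattening the rewards to [3] keeps the product element-wise,
# so the final sum is a single weighted log loss value.
per_sample = log_prob * discounted_rewards.reshape(-1)
scalar_loss = -np.sum(per_sample)

print(broadcast_product.shape, vector_loss.shape, per_sample.shape)
```

Under these assumptions, dropping axis=0 alone would also return a scalar, but it would sum every pairing of log-probability and reward rather than each sample's own pair. In TensorFlow, tf.squeeze(self.discounted_rewards, axis=1) should play the role of the reshape used here.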