I am trying to define a log loss function for a multi-class classification problem:
self.loss = tf.losses.log_loss(
    labels=self.sampled_actions,
    predictions=[self.probability[i][self.sampled_actions[i]] for i in range(tf.shape(self.sampled_actions)[0])],
    weights=self.discounted_rewards)
Here, self.sampled_actions is a 1-D tensor of 0/1/2 values (for example, [0,1,2,1,0,2]) that indicates which action is the ground truth for each sample. self.probability is defined as:
h = tf.layers.dense(
    self.observations,
    units=hidden_layer_size,
    activation=tf.nn.relu,
    kernel_initializer=tf.contrib.layers.xavier_initializer())
self.probability = tf.layers.dense(
    h,
    units=3,
    activation=tf.sigmoid,
    kernel_initializer=tf.contrib.layers.xavier_initializer())
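This gives the probabilities of all three actions (0, 1, 2) for any given observation in the input.
However, when I run this program, I get the following error: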
Traceback (most recent call last):
  File "spaceinvaders.py", line 68, in <module>
    hidden_layer_size, learning_rate, checkpoints_dir='checkpoints')
  File "/home/elfarouk/Desktop/opengym/policy_network_space_invaders.py", line 49, in __init__
    predictions= [self.probability[i][self.sampled_actions[i]] for i in range(tf.shape(self.sampled_actions)[0])],
TypeError: range() integer end argument expected, got Tensor.
Is there a way to specify that the predictions in my loss function should depend on sampled_actions?
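To make the intent concrete, here is a minimal NumPy sketch of the per-sample selection I am after, written with a one-hot mask instead of a Python loop. It is only an illustration under assumptions, not TensorFlow code: the helper names selected_probabilities and weighted_log_loss, the eps constant, and the sample values are all hypothetical, and the loss shown (weighted mean of the negative log of the chosen action's probability) is my assumption about what the loss should compute. In a TensorFlow graph, tf.one_hot and tf.reduce_sum would play the roles of the mask and the row sum.

```python
import numpy as np

def selected_probabilities(probability, sampled_actions):
    """Return probability[i, sampled_actions[i]] for every row i, without a Python loop."""
    num_actions = probability.shape[1]
    # One-hot mask of shape (batch, num_actions); tf.one_hot builds the same mask in a graph.
    mask = np.eye(num_actions)[sampled_actions]
    # Zero out the non-chosen actions and sum each row (tf.reduce_sum with axis=1 is the analogue).
    return np.sum(probability * mask, axis=1)

def weighted_log_loss(probability, sampled_actions, weights, eps=1e-7):
    """Hypothetical loss: weighted mean of -log(p) for the chosen action (an assumption)."""
    chosen = selected_probabilities(probability, sampled_actions)
    # eps guards against log(0); its value here is arbitrary.
    return np.mean(-weights * np.log(chosen + eps))

# Made-up sample data standing in for self.probability, self.sampled_actions
# and self.discounted_rewards.
probability = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.3, 0.3, 0.4]])
sampled_actions = np.array([0, 1, 2])
discounted_rewards = np.array([1.0, 0.5, 2.0])

chosen = selected_probabilities(probability, sampled_actions)
loss = weighted_log_loss(probability, sampled_actions, discounted_rewards)
```

The mask works because its shape depends only on the number of actions, so nothing in Python has to iterate over the batch dimension, which is the part that is unknown when the graph is built.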