Incompatible shapes in keras function

Time: 2019-03-26 20:44:52

Tags: python tensorflow keras reinforcement-learning

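I am trying to implement an actor-critic network with keras (and tensorflow 2.0 alpha, with eager execution turned off), but there seems to be a bug in the keras function that updates the actor network's weights.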

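I have left my print() statements in the code to show what I have already investigated, and hopefully to make up for the fact that the code is incomplete and therefore not reproducible. Edit: here are my actor and critic models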

[image: actor and critic models]

The function is called as follows, and I print the shapes of all input arrays:

# Networks optimization
print('Shapes of vars: states: {}, actions: {}, advantages: {}'.format(
    np.array(states).shape, np.array(actions).shape, np.array(advantages).shape))
self.a_opt([states, actions, advantages]) # call the keras function written out below
# a print statement here is never reached
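One way to make such a shape printout more actionable is to compare each fed array's shape against the static shape of the placeholder it feeds. The following is only a minimal sketch in plain Python: `shape_matches` is a hypothetical helper, the fed shapes are the ones reported in the console output further down, and the placeholder shapes for the model input and the advantages are assumptions (only `(None, 3)` for `action_pl` is actually shown in the question).

```python
def shape_matches(static_shape, fed_shape):
    """Return True if fed_shape fits static_shape; None matches any size."""
    if len(static_shape) != len(fed_shape):
        return False
    return all(s is None or s == f for s, f in zip(static_shape, fed_shape))


# (static placeholder shape, fed array shape) per input.
# ASSUMPTION: the static shapes for states and advantages are guesses;
# the question only shows (None, 3) for action_pl.
feeds = {
    "states": ((None, 1, 44), (999, 1, 44)),
    "actions": ((None, 3), (999, 3)),
    "advantages": ((None,), (999,)),
}

for name, (static_shape, fed_shape) in feeds.items():
    ok = shape_matches(static_shape, fed_shape)
    print("{}: placeholder {} vs fed {} -> {}".format(
        name, static_shape, fed_shape, "ok" if ok else "MISMATCH"))
```

In the real code, the static shapes could be read with `K.int_shape(...)` on each placeholder, as the question already does for `action_pl`, instead of being typed in by hand.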

The function being called (which raises the "Incompatible shapes" error) looks like this:

def a_opt(self):
    """ Actor Optimization: Advantages + Entropy term to encourage exploration
    (Cf. https://arxiv.org/abs/1602.01783)
    """
    modelout = K.print_tensor(
        self.model.output, message="model output: " + str(K.int_shape(self.model.output)))
    action_pl = K.print_tensor(
        self.action_pl, message="action_pl: " + str(K.int_shape(self.action_pl)))

    weighted_actions = K.sum(action_pl * modelout, axis=1)
    weighted_actions = K.print_tensor(
        weighted_actions, message="weighted_actions: ")

    eligibility = K.log(weighted_actions + 1e-10) * \
        K.stop_gradient(self.advantages_pl)
    eligibility = K.print_tensor(eligibility, message="eligibility: ")

    entropy = K.sum(modelout *
                    K.log(modelout + 1e-10), axis=1)
    entropy = K.print_tensor(entropy, message="entropy: ")

    loss = 0.001 * entropy - K.sum(eligibility)
    loss = K.print_tensor(loss, message="loss: ")

    updates = self.rms_optimizer.get_updates(loss=loss,
                                             params=self.model.trainable_weights)
    return K.function([self.model.input, self.action_pl, self.advantages_pl], [], updates=updates)

So far so good, but running the program produces the following console output:

Shapes of vars: states: (999, 1, 44), actions: (999, 3), advantages: (999,)
action_pl: (None, 3)[[0.861626744 0.928109825 0.0259102583...]...]
model output: (None, 3)[[0.365334 0.333090335 0.301575601]]
Traceback (most recent call last):
  File ".\actor_critic.py", line 85, in <module>
    agent.train(marketSim, ac_args, summary_writer)
  File "FILEPATH", line 115, in train
    self.train_models(states, actions, rewards, done)
  File "FILEPATH", line 76, in train_models
    self.a_opt([states, actions, advantages])
  File "C:\Python37\lib\site-packages\tensorflow\python\keras\backend.py", line 3096, in __call__
    run_metadata=self.run_metadata)
  File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1440, in __call__
    run_metadata_ptr)
  File "C:\Python37\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 548, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [1,44] vs. [1,3]
         [[{{node gradients/mul_grad/BroadcastGradientArgs}}]]
         [[Sum_1/_119]]

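As you can see, the batch size is 999, each state has shape (1, 44), and each action has shape (3,). From the error message, I guess that I am multiplying these two somewhere, but I cannot find where that happens. I also don't understand why action_pl and model_output appear to have the same shape (None, 3): for model_output this is clearly correct, but the printed action_pl tensor looks as if it might actually have a different shape (maybe (1, 44)?). This confuses me completely, because the actions list I pass to the function definitely has shape (999, 3).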
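For context, the "Incompatible shapes" message on a `mul` node is a broadcasting failure: two shapes are compared from the trailing dimension backwards, and each pair of sizes must be equal or contain a 1. The sketch below re-implements that rule in plain Python purely for illustration; `broadcast_shape` is a hypothetical helper, not a TensorFlow or Keras function, and it does not identify which tensors in the graph actually carry these shapes.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or raise ValueError."""
    result = []
    # Walk both shapes from the last dimension, padding the shorter one with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(
                "Incompatible shapes: {} vs. {}".format(list(a), list(b)))
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


# Shapes that an elementwise product of actions and model output should have.
print(broadcast_shape((999, 3), (999, 3)))

# The pair reported in the traceback: 44 vs 3 cannot be broadcast.
try:
    broadcast_shape((1, 44), (1, 3))
except ValueError as err:
    print(err)
```

Under this rule, a `(1, 44)` operand can only reach a multiplication with a `(1, 3)` operand if one of the tensors involved is not the one it is expected to be, which matches the suspicion about `action_pl` above.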

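I am also not sure which line actually produces the error: judging by the print_tensor lines, I would guess it is the weighted_actions = K.sum... line, since nothing below it (or deeper in the computation graph) is printed, but I may be wrong.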

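tl;dr: which line of a_opt() actually produces the error, where does the erroneous shape [1,44] come from, and is there a better way to debug a computation graph like this one?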
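One possible way to separate shape problems from graph problems is to mirror the arithmetic of a_opt() outside the graph and check shapes at every step. The sketch below is a plain-Python mirror of the loss as written in the question, not the original implementation; `actor_loss_terms` is a hypothetical name, and the inputs are small made-up values rather than data from the question.

```python
import math


def actor_loss_terms(probs, actions, advantages, eps=1e-10):
    """Mirror of the loss in a_opt(): one loss value per sample."""
    if not (len(probs) == len(actions) == len(advantages)):
        raise ValueError("batch sizes differ: {}, {}, {}".format(
            len(probs), len(actions), len(advantages)))
    eligibility, entropy = [], []
    for p, a, adv in zip(probs, actions, advantages):
        if len(p) != len(a):
            raise ValueError(
                "per-sample sizes differ: {} vs. {}".format(len(p), len(a)))
        # K.sum(action_pl * modelout, axis=1)
        weighted = sum(pi * ai for pi, ai in zip(p, a))
        # K.log(weighted_actions + 1e-10) * advantages
        eligibility.append(math.log(weighted + eps) * adv)
        # K.sum(modelout * K.log(modelout + 1e-10), axis=1)
        entropy.append(sum(pi * math.log(pi + eps) for pi in p))
    total = sum(eligibility)  # K.sum(eligibility) is a scalar
    # 0.001 * entropy - K.sum(eligibility) keeps one entry per sample
    return [0.001 * h - total for h in entropy]


# Made-up batch of two samples with three actions each.
probs = [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1]]
actions = [[0, 1, 0], [1, 0, 0]]
advantages = [1.0, -0.5]
print(actor_loss_terms(probs, actions, advantages))
```

If this mirror runs cleanly on the real states' predictions, actions, and advantages, the arithmetic itself is shape-consistent, which would point the search toward how the placeholders are wired to the model rather than toward the formulas.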

0 answers:

No answers