I am trying to modify the DDPG algorithm to control a musculoskeletal model of an arm. The standard model works fine, but I want the hand-target position to enter the network earlier than the muscle-length feedback. Here is my network diagram, which appears to compile.
As you can see, I use Lambda layers to split the network's input into 47 muscle inputs (left branch) and 3 hand-target coordinates (right branch). I am able to feed data through this model in a forward pass.
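For reference, here is a minimal sketch of that split, ignoring keras-rl's extra window dimension for simplicity; everything beyond the 47/3 split, including the layer sizes and names, is hypothetical rather than my real network:

from keras.layers import Input, Lambda, Dense, Concatenate
from keras.models import Model

obs = Input(shape=(51,), name='observation_input')
muscles = Lambda(lambda x: x[:, :47], name='muscle_lengths')(obs)  # left branch
target = Lambda(lambda x: x[:, 47:], name='hand_target')(obs)      # right branch

# The hand target enters the network first ...
h = Dense(32, activation='relu')(target)
# ... and the muscle-length feedback joins further down.
h = Concatenate()([h, muscles])
h = Dense(32, activation='relu')(h)
actions = Dense(47, activation='sigmoid', name='muscle_excitations')(h)

actor = Model(inputs=obs, outputs=actions)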
The problem occurs when I try to compile my DDPG algorithm with the actor and critic in order to compute the policy gradient. The graph of the critic can be seen here.
Here is the code that does this:
# Combine actor and critic so that we can get the policy gradient.
# Assuming critic's state inputs are the same as actor's.
combined_inputs = []
critic_inputs = []
for i in self.critic.input:
    if i == self.critic_action_input:
        # Reserve a placeholder slot for the action input.
        combined_inputs.append([])
    else:
        combined_inputs.append(i)
        critic_inputs.append(i)
# Fill the action slot with the actor applied to the state inputs.
combined_inputs[self.critic_action_input_idx] = self.actor(critic_inputs)
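For context, this is the structure the loop is meant to produce, written as a minimal standalone sketch with toy layers (all sizes and names here are hypothetical, not my real networks):

from keras.layers import Input, Dense, Concatenate
from keras.models import Model

obs = Input(shape=(51,), name='observation_input')
action = Input(shape=(47,), name='action_input')

# Toy actor: observation -> action.
actor = Model(obs, Dense(47, activation='sigmoid')(Dense(32, activation='relu')(obs)))

# Toy critic: (observation, action) -> Q-value.
q = Dense(1)(Dense(32, activation='relu')(Concatenate()([obs, action])))
critic = Model([obs, action], q)

# Swap the critic's action input for the actor's output, so the policy
# gradient can flow from the Q-value back through the actor's weights:
combined = Model(obs, critic([obs, actor(obs)]))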
The error occurs on the last line of the snippet above:
ValueError: Dimensions must be equal, but are 51 and 47 for 'model_2/dense_9/MatMul' (op: 'MatMul') with input shapes: [0,51], [47,31].
From my diagram, this makes no sense to me. The input space is (None, 51), and critic_inputs is

[<tf.Tensor 'observation_input_4:0' shape=(?, 1, 51) dtype=float32>]
This works with my simpler model.
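In case it helps, this is the kind of shape check that should show where a 51-wide tensor meets a layer built for 47 inputs (a diagnostic sketch, not keras-rl code; the commented shapes are my expectations based on the error above):

print(self.actor.input_shape)   # per the tensor above: (None, 1, 51)
print(self.actor.output_shape)  # this is what fills the critic's action slot
for layer in self.critic.layers:
    try:
        print(layer.name, layer.input_shape)
    except AttributeError:
        print(layer.name, '(multiple inbound nodes)')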
Any suggestions would be greatly appreciated.