I have a dueling double deep Q network model that works with two dense layers, and since my model deals with time series I tried to convert them into two LSTM layers. When I change the dense layers in the code I get this error, and I can't get past it. I know this question has been answered here many times, but none of those solutions work for me.
The code that works with two dense layers is the following:
import numpy as np
import tensorflow as tf
from tensorflow import keras

class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        self.dense1 = keras.layers.Dense(fc1_dims, activation='relu')
        self.dense2 = keras.layers.Dense(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)          # state-value stream
        self.A = keras.layers.Dense(n_actions, activation=None)  # advantage stream

    def call(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        V = self.V(x)
        A = self.A(x)
        # dueling aggregation: Q = V + (A - mean(A))
        Q = (V + (A - tf.math.reduce_mean(A, axis=1, keepdims=True)))
        return Q

    def advantage(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        A = self.A(x)
        return A
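(For context, a minimal usage sketch, illustrative only and not from the original post, of this dense version with the dimensions given further down, fc1_dims=64, fc2_dims=32, n_actions=2, 8 input variables; the Dense layers accept a two-dimensional batch of shape (batch, features):)

# Illustrative only: the dense network is fed 2-D input of shape (batch, features).
model = DuelingDeepQNetwork(n_actions=2, fc1_dims=64, fc2_dims=32)
dummy_batch = np.random.random((64, 8)).astype(np.float32)  # (batch=64, features=8)
q_values = model(dummy_batch)               # shape (64, 2)
advantages = model.advantage(dummy_batch)   # shape (64, 2)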
It works without any problem, but when I convert the first two dense layers to LSTM like this:
class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        self.dense1 = keras.layers.LSTM(fc1_dims, activation='relu')
        self.dense2 = keras.layers.LSTM(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)
        self.A = keras.layers.Dense(n_actions, activation=None)
this error appears:
Input 0 of layer lstm_24 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [64, 8]
Following the question "expected ndim=3, found ndim=2", I already tried setting the input shape with state = state.reshape(64, 1, 8) before running the neural network:
def choose_action(self, observation):
    if np.random.random() < self.epsilon:
        action = np.random.choice(self.action_space)
    else:
        state = np.array([observation])
        state = state.reshape(64, 1, 8)  # <--------
        actions = self.q_eval.advantage(state)
        action = tf.math.argmax(actions, axis=1).numpy()[0, 0]
    return action
But I get exactly the same error. I also tried adding return_sequences=True to both layers, but that didn't work either.
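(For reference, a minimal shape sketch, not from the original post: np.array([observation]) for an 8-variable observation is a (1, 8) array holding only 8 values, so it cannot become (64, 1, 8); an LSTM wants three dimensions, (batch, timesteps, features).)

import numpy as np

# Illustrative shape check: a single 8-variable observation is 2-D after np.array([...]).
observation = np.zeros(8, dtype=np.float32)
state = np.array([observation])   # shape (1, 8)  -> "found ndim=2"
state = state.reshape(1, 1, 8)    # (batch=1, timesteps=1, features=8) -> ndim=3
# state.reshape(64, 1, 8) cannot work here: 8 elements cannot fill a (64, 1, 8) array.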
I don't know what to do, and I have to hand this in within a week. Can anyone enlighten me?
I am using fc1_dims = 64, fc2_dims = 32 and n_actions = 2. The model uses 8 variables and a batch size of 64. I uploaded the code to GitHub so you can run it if you want. The project is not finished yet, so I won't write a proper README for now.
[GitHub with the code][2]
Answer 0 (score: 0)
The code below works for me without any problem.
class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        self.dense1 = keras.layers.LSTM(fc1_dims, activation='relu', return_sequences=True)
        self.dense2 = keras.layers.LSTM(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)
        self.A = keras.layers.Dense(n_actions, activation=None)

    def call(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        V = self.V(x)
        A = self.A(x)
        Q = (V + (A - tf.math.reduce_mean(A, axis=1, keepdims=True)))
        return Q

    def advantage(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        A = self.A(x)
        return A
and then the model is built and called as follows:
LSTMModel = DuelingDeepQNetwork(2, 64, 32)
LSTMModel.build(input_shape=(None,1,8))
LSTMModel.summary()
The result looks like this:
Model: "dueling_deep_q_network_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_12 (LSTM)               multiple                  18688
_________________________________________________________________
lstm_13 (LSTM)               multiple                  12416
_________________________________________________________________
dense_16 (Dense)             multiple                  33
_________________________________________________________________
dense_17 (Dense)             multiple                  66
=================================================================
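As a follow-up (an assumed usage sketch, not part of the original answer), the built model now expects three-dimensional input of shape (batch, timesteps, features), here (batch, 1, 8):

import numpy as np
import tensorflow as tf

# Illustrative only: feed the LSTM model 3-D states of shape (batch, 1, 8).
single_state = np.random.random((1, 1, 8)).astype(np.float32)    # one observation
batch_states = np.random.random((64, 1, 8)).astype(np.float32)   # a replay batch
q_single = LSTMModel(single_state)              # shape (1, 2)
adv_batch = LSTMModel.advantage(batch_states)   # shape (64, 2)
best_action = tf.math.argmax(adv_batch, axis=1).numpy()[0]  # action index for the first sample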