Following Denny Britz's TF implementation as a guideline, I implemented the basic REINFORCE algorithm from scratch in Keras. I am experimenting with a multi-agent setup, training two agents (a Sender and a Receiver) at once. For both agents, the first call to train_on_batch runs without problems. However, on the second call to train_on_batch it crashes with the following error:
FailedPreconditionError: Error while reading resource variable _AnonymousVar18 from Container:
localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar18/class
tensorflow::Var does not exist.
[[node mul_3/ReadVariableOp (defined at D:\ProgramData\Anaconda3\envs\jupyter-mess\lib\site-packages\keras\backend\tensorflow_backend.py:3014) ]]
[Op:__inference_keras_scratch_graph_1235]
Function call stack:
keras_scratch_graph
Here is the Sender model, a simple feed-forward network with a custom loss. The Receiver network is very similar.
import tensorflow as tf
from keras import layers
from keras import backend as K
from keras.models import Model

class Sender:
    def __init__(self, n_images, input_image_shape, embedding_size,
                 vocabulary_size, temperature):
        self.reset_memory()

        image_inputs = [layers.Input(shape=input_image_shape, dtype="float32")
                        for i in range(n_images)]
        image_embedding_layer = layers.Dense(embedding_size)
        sigmoid = layers.Activation("sigmoid")
        output_layer = layers.Dense(vocabulary_size)
        temperature_layer = layers.Lambda(lambda x: x / temperature)
        softmax = layers.Softmax()

        y = [image_embedding_layer(x) for x in image_inputs]
        y = [sigmoid(x) for x in y]
        y = layers.concatenate(y, axis=-1)
        y = output_layer(y)
        y = temperature_layer(y)
        y = softmax(y)
        self.model = Model(image_inputs, y)

        index = layers.Input(shape=[1], dtype="int32")
        y_selected = layers.Lambda(
            lambda probs_index: tf.gather(*probs_index, axis=-1),
        )([y, index])

        def loss(target, prediction):
            return -K.log(prediction) * target

        self.model_train = Model([*image_inputs, index], y_selected)
        self.model_train.compile(loss=loss, optimizer=OPTIMIZER)

    def predict(self, state):
        return self.model.predict_on_batch(x=state)

    def update(self, state, action, target):
        x = [*state, action]
        return self.model_train.train_on_batch(x=x, y=target)
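For context, the custom loss above is the standard REINFORCE objective, -log π(a|s) · R, where the reward is passed in as the target. A minimal NumPy sketch of the same computation (the probability and reward values are purely illustrative):

```python
import numpy as np

# Hypothetical values: the probability the policy assigned to the
# sampled action, and the scalar episode reward (the "target").
prediction = 0.25   # π(a|s) for the chosen action
reward = 1.0        # reward fed to train_on_batch as y

# REINFORCE loss: -log π(a|s) * R, matching the custom `loss` above.
loss = -np.log(prediction) * reward
print(round(loss, 4))  # ≈ 1.3863, i.e. log(4)
```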
And here is the training loop:
for episode in range(1, N_EPISODES + 1):
    game.reset()

    sender_state = game.get_sender_state(n_images=N_IMAGES,
                                         unique_categories=True,
                                         expand=True)
    sender_probs = sender.predict(state=sender_state)
    sender_probs = np.squeeze(sender_probs)
    sender_action = np.random.choice(np.arange(len(sender_probs)),
                                     p=sender_probs)

    receiver_state = game.get_receiver_state(sender_action, expand=True)
    receiver_probs = receiver.predict(receiver_state)
    receiver_probs = np.squeeze(receiver_probs)
    receiver_action = np.random.choice(np.arange(len(receiver_probs)),
                                       p=receiver_probs)

    sender_reward, receiver_reward, success = game.evaluate_guess(receiver_action)

    sender.update(sender_state, np.asarray([sender_action]),
                  np.asarray([sender_reward]))
    receiver.update(receiver_state, np.asarray([receiver_action]),
                    np.asarray([receiver_reward]))
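For clarity, the action-selection step in the loop samples from the categorical distribution produced by the model's softmax output; a standalone sketch of just that step, with an illustrative probability vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical softmax output over a 4-word vocabulary.
probs = np.array([0.1, 0.6, 0.2, 0.1])

# Sample an action index proportionally to the probabilities,
# as in the loop's np.random.choice(..., p=probs) calls.
action = rng.choice(np.arange(len(probs)), p=probs)
assert 0 <= action < len(probs)
```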
The implementation crashes when I run it locally in a Jupyter notebook with Keras 2.3.1, but it runs fine on Google Colab with Keras 2.4.3.

Moreover, when I disable one of the agents, the program runs without errors for the remaining one, but I cannot train them that way.

Is there a way to inspect the variable that causes the failure? Do I need to run a separate session for each agent, or something along those lines?