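I am trying to implement deep Q-learning. The program has three parts: env, agent, and deploy. env is a custom environment with a Gym-like API, agent defines the neural network, and the deploy module takes (env, agent) as input and runs training and testing. Below is a minimal version of the code that reproduces my problem. The agent is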
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop, Adam

class DQL_Agent():
    def __init__(self, **kwargs):
        # args read in
        # note: self._state_size, self._action_size and self._lr are not set in this minimal example
        self._train_mode = kwargs.get('train_mode', True)
        self._model_path = kwargs.get('model_path', None)
        # initialize model according to mode
        if self._train_mode:
            self._model = self.build_model()
        else:
            self._model = self.restore_model()

    def build_model(self):
        model = Sequential()
        model.add(Dense(64, input_shape=(self._state_size,), activation='relu', kernel_initializer='he_uniform'))
        model.add(Dropout(0.5))
        model.add(Dense(32, activation='relu', kernel_initializer='he_uniform'))
        model.add(Dropout(0.5))
        model.add(Dense(self._action_size, activation='linear', kernel_initializer='he_uniform'))
        # optimizer
        rms = RMSprop()
        adam = Adam(lr=self._lr)
        # model compile
        model.compile(loss='mse', optimizer=adam)
        return model

    def predict(self, X_test):
        return self._model.predict(X_test)

    def restore_model(self):
        # load the saved model from the path given at construction
        return load_model(self._model_path)
and the deploy module is
import os
import tensorflow as tf
from keras import backend as K

class Deploy(object):
    def __init__(self, env, agent, **kwargs):
        self._env = env
        self._agent = agent
        self._gpu_fraction = kwargs.get('gpu_fraction', 0.25)
        self._gpu_device = kwargs.get('gpu_device', -1)
        self._train_mode = agent._train_mode
        # system platform configuration
        K.set_session(self.get_session())

    def get_session(self):
        if self._gpu_device == -1:  # CPU mode
            return tf.Session()
        else:
            os.environ['CUDA_VISIBLE_DEVICES'] = str(self._gpu_device)
            gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=self._gpu_fraction)
            return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

    def run(self, epochs=1000, **kwargs):
        if self._train_mode:
            pass
        else:
            # use the environment stored on the instance
            state = self._env.reset()
            return self._agent.predict(state)
The env module has the same API as Gym and works fine.
The problem is that when I run the following code
agent = DQL_Agent(train_mode=False, model_path='model.h5') # testing mode
deploy = Deploy(env, agent)
x = deploy.run() # here comes the error
I get this error
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value dense_3/bias
[[Node: dense_3/bias/read = Identity[T=DT_FLOAT, _class=["loc:@dense_3/bias"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](dense_3/bias)]]
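It tells me that a layer's variable is uninitialized. However, since we are in testing mode, i.e. train_mode=False, we actually load the existing model model.h5 and pass it to the Deploy class, so no initialization should be needed.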
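To see which variables the session considers uninitialized, a small diagnostic along these lines might help. It is only a sketch, assuming the TensorFlow 1.x API (tf.report_uninitialized_variables); the helper name uninitialized_variable_names is hypothetical and not part of my program.

```python
def uninitialized_variable_names(sess):
    """Return the names of graph variables that hold no value in `sess`."""
    import tensorflow as tf  # TensorFlow 1.x API assumed

    # report_uninitialized_variables yields a 1-D string tensor of variable names
    names = sess.run(tf.report_uninitialized_variables())
    # names usually come back as bytes; decode them for readable output
    return [n.decode() if isinstance(n, bytes) else n for n in names]
```

If the weights are missing from the session created in get_session, I would expect a call made after the model has been loaded to list dense_3/bias among the results.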
In fact, the problem seems to come from K.set_session and get_session, which I use to limit GPU usage. If I remove the line K.set_session(self.get_session()) from the Deploy class, the error goes away.
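One way to test this hypothesis is to install the session before the model is loaded instead of after. The sketch below assumes TensorFlow 1.x with standalone Keras, where, as far as I understand, load_model writes the weights into whichever session is current at that moment. make_session and load_agent_model are hypothetical helper names, not part of the code above.

```python
import os

def make_session(gpu_device=-1, gpu_fraction=0.25):
    """Build a session with the same CPU/GPU settings as Deploy.get_session."""
    import tensorflow as tf  # TensorFlow 1.x API assumed

    if gpu_device == -1:  # CPU mode
        return tf.Session()
    os.environ['CUDA_VISIBLE_DEVICES'] = str(gpu_device)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction)
    return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

def load_agent_model(model_path, gpu_device=-1, gpu_fraction=0.25):
    """Set the Keras session first, then load the model into that session."""
    from keras import backend as K
    from keras.models import load_model

    # Step 1: make the configured session the one Keras uses
    K.set_session(make_session(gpu_device, gpu_fraction))
    # Step 2: only now load the weights, so they are assigned in that session
    return load_model(model_path)
```

If the error disappears with this ordering, that would suggest the weights were loaded into one session and predict was then run in a different, freshly created one.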
So how can this be explained, and how do I fix it?