I want the computer to learn how to drive around a small area (288, 512), like a maze seen from above. My model is:
main_input = Input(shape=(1, 576, 1024), dtype='float32', name='main_input')
x = Cropping2D((188, 412), data_format="channels_first")(main_input)
x = Conv2D(32, (8, 8), strides=(4, 4), data_format="channels_first", activation='relu')(x)
# keep data_format="channels_first" on every layer to match the input layout
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), data_format="channels_first")(x)
x = Conv2D(64, (4, 4), strides=(2, 2), data_format="channels_first", activation='relu')(x)
x = MaxPooling2D(pool_size=(2, 2), data_format="channels_first")(x)
x = Conv2D(64, (3, 3), data_format="channels_first", activation='relu')(x)
x = MaxPooling2D(pool_size=(1, 1), data_format="channels_first")(x)  # pool_size=(1, 1) is a no-op
x = Flatten()(x)
# `vec` and `velo` are additional Input layers (direction vector and velocity), defined elsewhere
merge = Flatten()(vec)
x = keras.layers.concatenate([x, merge, velo])
x = Dense(10, activation='relu')(x)
x = BatchNormalization()(x)
x = Dense(nb_actions, activation='linear')(x)
# Finally, we configure and compile our agent. You can use any built-in Keras optimizer and metrics.
memory = SequentialMemory(limit=50000, window_length=1)
# stitch together
model = Model([main_input, vec, velo], x)
model.summary()
#policy = BoltzmannQPolicy(tau=1.)
policy = LinearAnnealedPolicy(EpsGreedyQPolicy(), attr='eps', value_max=1., value_min=.1, value_test=.05,
nb_steps=100)
processor = MultiInputProcessor(3)
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, processor=processor, nb_steps_warmup=40,
target_model_update=100, policy=policy)
dqn.compile(Adam(lr=0.0025), metrics=['mae'])
dqn.load_weights(f'dqn_{ENV_NAME}_weights-{model_version}.h5f')
# Okay, now it's time to learn something! Visualization is disabled here because it slows down
# training quite a lot. You can always safely abort the training prematurely using Ctrl + C.
dqn.fit(env, nb_steps=100, visualize=False, verbose=0, action_repetition=3, nb_max_episode_steps=100)
#dqn.test(env, nb_episodes=10, visualize=False)
# After training is done, we save the final weights.
dqn.save_weights(f'dqn_{ENV_NAME}_weights-{model_version}.h5f', overwrite=True)
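One thing worth double-checking in the setup above: with nb_steps=100, epsilon-greedy exploration is fully annealed after only 100 steps, which is also the entire length of the fit() call. A minimal sketch of the linear schedule (the formula mirrors the LinearAnnealedPolicy constructor arguments; the function name is mine):

# Linear annealing of eps from value_max down to value_min over nb_steps,
# then clamped at value_min, as configured above
def eps_at(step, value_max=1.0, value_min=0.1, nb_steps=100):
    slope = -(value_max - value_min) / nb_steps
    return max(value_min, slope * step + value_max)

print(eps_at(0), eps_at(50), eps_at(100))

So after step 100 the agent explores with a fixed eps of 0.1 for the rest of training.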
It receives a 576x1024 grayscale image and crops it to 200x200. Velocity and direction … the vector from frame to frame and then to length …
, but it doesn't seem to learn anything at all. Does anyone know why?
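For reference, the tensor shapes flowing through the conv stack above can be checked without running Keras. A minimal sketch in pure Python, assuming Keras' default 'valid' padding and symmetric Cropping2D semantics:

# Output size of a conv/pool layer with 'valid' padding (the Keras default)
def conv_out(size, kernel, stride):
    return (size - kernel) // stride + 1

# Cropping2D((188, 412)) crops 188 rows top+bottom and 412 cols left+right
h, w = 576 - 2 * 188, 1024 - 2 * 412        # -> 200 x 200
for kernel, stride in [(8, 4), (3, 2), (4, 2), (2, 2), (3, 1), (1, 1)]:
    h, w = conv_out(h, kernel, stride), conv_out(w, kernel, stride)
print(h, w, 64 * h * w)  # -> 3 3 576

With a (1, 200, 200) channels_first input, the stack ends at 64 x 3 x 3, i.e. 576 features entering Flatten, which model.summary() should confirm.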