标签: reinforcement-learning openai-gym
是在强化学习环境(如OpenAI体育馆)中随机选择的初始状态。换句话说,命令env.reset()会导致随机选择的初始状态还是特定的初始状态?
答案 0 :(得分:2)
通常是的,它是随机的。但是,最好确保查看环境的源代码。例如the pendulum initial state is uniformly drawn from the whole state space,而for the mountain car the state position is uniformly drawn from [-0.6, -0.4] and the velocity is always 0。