Question

I'm still new to the OpenAI environments. I'm using https://github.com/ugo-nama-kun/gym_torcs/tree/master/vtorcs-RL-color to try out different reinforcement learning agents.

So I wrote my own REINFORCE and GPOMDP agents, in which I first create the environment

 env = TorcsEnv(vision=vision, throttle=False)

and then call methods such as env.reset() and env.step(). Everything works, and training starts without problems.

Now I'd like to try the Baselines library (https://github.com/openai/baselines) with this gym-torcs env, so I took https://github.com/openai/baselines/blob/master/baselines/trpo_mpi/run_mujoco.py as an example and replaced

env = make_mujoco_env(env_id, workerseed)

with

 env = TorcsEnv(vision=vision, throttle=False)

TORCS starts correctly, but at the point where the car should start driving I get the following error:

Traceback (most recent call last):
  File "myAgent.py", line 39, in <module>
    main()
  File "myAgent.py", line 35, in main
    train(args.env, num_timesteps=args.num_timesteps, seed=args.seed)
  File "myAgent.py", line 30, in train
    max_timesteps=1000, gamma=0.99, lam=0.98, vf_iters=5, vf_stepsize=1e-3)
  File "/usr/src/baselines/baselines/trpo_mpi/trpo_mpi.py", line 199, in learn
    seg = seg_gen.__next__()
  File "/usr/src/baselines/baselines/trpo_mpi/trpo_mpi.py", line 36, in traj_segment_generator
    ac, vpred = pi.act(stochastic, ob)
  File "/usr/src/baselines/baselines/ppo1/mlp_policy.py", line 54, in act
    ac1, vpred1 = self._act(stochastic, ob[None])
  File "/usr/src/baselines/baselines/common/tf_util.py", line 194, in __call__
    results = tf.get_default_session().run(self.outputs_update, feed_dict=feed_dict)[:-1]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1104, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
  File "/home/nicolobrunello/.local/lib/python3.5/site-packages/numpy/core/numeric.py", line 492, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

Does anyone know how I should integrate Baselines with gym-torcs?

P.S.: I'm using Python 3.5.2 and Ubuntu 16.04.4 (64-bit).
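
For context on the ValueError: the last frames of the traceback show np.asarray failing while converting the observation to the dtype of the policy's input placeholder. One plausible cause (an assumption, since the question does not show what env.reset() returns) is that the observation is a structure whose fields have different lengths rather than a flat numeric array. The sketch below reproduces that failure with a stand-in observation and shows one way to flatten it. The Observation type, its field names, and flatten_observation are hypothetical and exist only for illustration.

```python
from collections import namedtuple

import numpy as np

# Hypothetical stand-in for a structured observation. The field names and
# sizes are illustrative assumptions, not taken from the question.
Observation = namedtuple("Observation", ["speedX", "rpm", "track"])


def flatten_observation(obs):
    """Concatenate every field of the observation into one 1-D float32 vector."""
    parts = [np.asarray(field, dtype=np.float32).ravel() for field in obs]
    return np.concatenate(parts)


obs = Observation(speedX=0.5, rpm=0.3, track=np.linspace(0.0, 1.0, 19))

# Converting the structured observation directly fails, because the fields
# do not share a common shape (two scalars and one length-19 vector).
try:
    np.asarray(obs, dtype=np.float32)
    conversion_failed = False
except ValueError:
    conversion_failed = True

# Flattening first yields a plain numeric vector that can be fed to a policy.
flat = flatten_observation(obs)
```

If this is indeed the cause, the flattening would have to be applied to every observation returned by env.reset() and env.step(), and the environment's observation_space would presumably need to describe the flattened shape as well (for example a gym.spaces.Box of matching size), since Baselines builds the policy input from it.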

Error when using the Baselines library on a custom Gym environment (gym-torcs)

0 answers