I'm trying to train a Q-learning model with TensorFlow and Keras, using some code I wrote a long time ago. But when I run it, I get this error:
Using TensorFlow backend.
Traceback (most recent call last):
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\eager\execute.py", line 206, in make_shape
    shape = tensor_shape.as_shape(v)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 1216, in as_shape
    return TensorShape(shape)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in __init__
    self._dims = [as_dimension(d) for d in dims_iter]
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in <listcomp>
    self._dims = [as_dimension(d) for d in dims_iter]
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 718, in as_dimension
    return Dimension(value)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 193, in __init__
    self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'TimeLimit'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\neelg\Documents\Atom_projects\Main\test.py", line 66, in <module>
    agent = DQNAgent(env, 450)
  File "C:\Users\neelg\Documents\Atom_projects\Main\test.py", line 25, in __init__
    self.model = self._build_model()
  File "C:\Users\neelg\Documents\Atom_projects\Main\test.py", line 31, in _build_model
    model.add(Dense(24, input_dim=self.state_size, activation='relu'))
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\keras\engine\sequential.py", line 162, in add
    name=layer.name + '_input')
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\keras\engine\input_layer.py", line 178, in Input
    input_tensor=tensor)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\keras\engine\input_layer.py", line 87, in __init__
    name=self.name)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\keras\backend\tensorflow_backend.py", line 75, in symbolic_fn_wrapper
    return func(*args, **kwargs)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\keras\backend\tensorflow_backend.py", line 736, in placeholder
    shape=shape, ndim=ndim, dtype=dtype, sparse=sparse, name=name)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\keras\backend.py", line 1057, in placeholder
    x = array_ops.placeholder(dtype, shape=shape, name=name)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 2630, in placeholder
    return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 6669, in placeholder
    shape = _execute.make_shape(shape, "shape")
  File "C:\Users\neelg\TF2_GPU\lib\site-packages\tensorflow_core\python\eager\execute.py", line 208, in make_shape
    raise TypeError("Error converting %s to a TensorShape: %s." % (arg_name, e))
TypeError: Error converting shape to a TensorShape: int() argument must be a string, a bytes-like object or a number, not 'TimeLimit'.
I can't see any usage of 'TimeLimit' anywhere, so I'm not sure what's going on. This error has been holding me up badly. I'm using TensorFlow 2.0-gpu, but I get the same error on the stock version of TensorFlow as well. Any help would be appreciated :)
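For reference, this minimal check (assuming the standard OpenAI Gym package) shows one place a TimeLimit object can come from: gym.make wraps the raw environment in a gym.wrappers.TimeLimit wrapper, so the environment object itself has that type:

import gym

env = gym.make('CartPole-v0')
# gym.make wraps the raw CartPole environment in a TimeLimit wrapper,
# so type(env) is the wrapper class, not the raw environment class
print(type(env))                     # <class 'gym.wrappers.time_limit.TimeLimit'>
print(env.observation_space.shape)   # (4,)  -> observation size for CartPole
print(env.action_space.n)            # 2     -> number of actions for CartPole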
Here is my code:
"""
Created on Mon Oct 14 22:01:50 2019
@author: neel
"""
import random  # used by act() and replay() below

import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers.core import Dense
from keras.optimizers import Adam  # used by _build_model() below
from collections import deque
import gym
class DQNAgent:
    def __init__(self, state_size, action_size):
        self.state_size = state_size
        self.action_size = action_size
        self.memory = deque(maxlen=2000)
        self.gamma = 0.95    # discount rate
        self.epsilon = 1.0   # exploration rate
        self.epsilon_min = 0.01
        self.epsilon_decay = 0.995
        self.learning_rate = 0.001
        self.model = self._build_model()

    def _build_model(self):
        # Neural Net for Deep-Q learning Model
        model = Sequential()
        model.add(Dense(24, input_dim=self.state_size, activation='relu'))
        model.add(Dense(24, activation='relu'))
        model.add(Dense(self.action_size, activation='linear'))
        model.compile(loss='mse',
                      optimizer=Adam(lr=self.learning_rate))
        return model

    def remember(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def act(self, state):
        if np.random.rand() <= self.epsilon:
            return random.randrange(self.action_size)
        act_values = self.model.predict(state)
        return np.argmax(act_values[0])  # returns action

    def replay(self, batch_size):
        minibatch = random.sample(self.memory, batch_size)
        for state, action, reward, next_state, done in minibatch:
            target = reward
            if not done:
                target = reward + self.gamma * \
                    np.amax(self.model.predict(next_state)[0])
            target_f = self.model.predict(state)
            target_f[0][action] = target
            self.model.fit(state, target_f, epochs=1, verbose=0)
        if self.epsilon > self.epsilon_min:
            self.epsilon *= self.epsilon_decay
if __name__ == "__main__":
    # initialize gym environment and the agent
    env = gym.make('CartPole-v0')
    agent = DQNAgent(env, 450)
    env.render()
    episodes = 1000  # number of games to play
    # Iterate the game
    for e in range(episodes):
        # reset state in the beginning of each game
        state = env.reset()
        state = np.reshape(state, [1, 4])
        # time_t represents each frame of the game
        # Our goal is to keep the pole upright as long as possible, up to a score of 500
        # the more time_t, the higher the score
        for time_t in range(500):
            # turn this on if you want to render
            env.render()
            # Decide action
            action = agent.act(state)
            # Advance the game to the next frame based on the action.
            # Reward is 1 for every frame the pole survived
            next_state, reward, done, _ = env.step(action)
            next_state = np.reshape(next_state, [1, 4])
            # Remember the previous state, action, reward, and done
            agent.remember(state, action, reward, next_state, done)
            # make next_state the new current state for the next frame.
            state = next_state
            # done becomes True when the game ends
            # ex) The agent drops the pole
            if done:
                # print the score and break out of the loop
                print("episode: {}/{}, score: {}"
                      .format(e, episodes, time_t))
                break
        # train the agent with the experience of the episode
        # (replay only once the memory holds at least one full batch;
        #  random.sample raises ValueError otherwise)
        if len(agent.memory) >= 32:
            agent.replay(32)
BTW, since I'm using CUDA with an Nvidia card, I suspect that's why the traceback is so long.
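In case it's relevant: the line the traceback points at, agent = DQNAgent(env, 450), passes the TimeLimit-wrapped environment object itself as state_size, which then flows into Dense(24, input_dim=self.state_size, ...). A minimal sketch of the call I suspect was intended, with the dimensions read from the environment's spaces instead:

# hypothetical corrected construction: pass integer sizes, not the env object
state_size = env.observation_space.shape[0]   # 4 for CartPole-v0
action_size = env.action_space.n              # 2 for CartPole-v0
agent = DQNAgent(state_size, action_size)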