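My Python script is using 20% of the CPU but only 0.3% of the GPU. I am still very new to Keras and TensorFlow, so it is quite possible that I am doing something wrong that causes this extreme slowness. The program is supposed to run 15000 training episodes of 400 steps each. No images or anything fancy are involved; the state is just a simple list of 3 elements. At the current speed, playing all 15000 episodes would take roughly 100 hours. If anyone can help, I would really like to find out whether TensorFlow is actually using my GPU. I would also appreciate any comments on my custom training loop.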
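To put the numbers above in perspective, here is the back-of-the-envelope arithmetic for the cost per step. It uses only the figures quoted in this post and assumes every episode runs the full 400 steps, so the step count is an upper bound; `estimated_hours` is my rough estimate, not a measurement.

```python
# Back-of-the-envelope cost per environment step, from the figures in this post.
num_training_episodes = 15000
num_steps = 400
estimated_hours = 100  # rough estimate, not a measured value

# Upper bound: assumes no episode ends early.
total_steps = num_training_episodes * num_steps
seconds_per_step = estimated_hours * 3600 / total_steps

print(f"total steps: {total_steps}")
print(f"time per step: {seconds_per_step * 1000:.0f} ms")
```

So each step, including the action selection and one training update, seems to cost on the order of tens of milliseconds.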
---Custom training loop---
# Imports used by this excerpt
import numpy as np
import tensorflow as tf

for episode in range(1, num_training_episodes + 1):
    done = False
    episode_reward = 0
    state = env.reset_environment(sat)
    for step in range(num_steps):
        action = agent.select_action(state, policy_net)
        new_state, episode_reward, done = env.step(sat, action)
        # Store the transition in replay memory
        memory.push(Experience(state.copy(), action, new_state.copy(), episode_reward))
        state = new_state
        # Train on a sampled minibatch once enough experience is stored
        if memory.can_provide_sample(batch_size):
            states, actions, new_states, rewards = memory.sample(batch_size)
            next_q_values = target_net.predict(new_states)
            target_q_values = rewards + gamma * tf.reduce_max(next_q_values, axis=1)
            masks = tf.one_hot(actions, num_actions)
            with tf.GradientTape() as tape:
                q_values = policy_net(np.array([states]))[0]
                q_action = tf.reduce_sum(tf.multiply(q_values, masks), axis=1)
                loss = loss_func(target_q_values, q_action)
            grads = tape.gradient(loss, policy_net.trainable_variables)
            optimizer.apply_gradients(zip(grads, policy_net.trainable_variables))
        if (done == True) or (step == num_steps - 1):
            reward_list.append(episode_reward)
            break
    # Copy the policy network weights into the target network every target_update episodes
    if (episode % target_update) == 0:
        for layer_policy, layer_target in zip(policy_net.layers, target_net.layers):
            layer_target.set_weights(layer_policy.get_weights())
    if (episode % print_every) == 0:
        print("Episode:", episode - print_every, "to:", episode,
              "Number of correct stops:", env.correct_stop_counter,
              "Mean rewards:", np.mean(reward_list[episode - print_every:episode]))
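To make the intended minibatch update easier to comment on, here is a NumPy-only sketch of the same arithmetic with the shapes spelled out. It is an illustration rather than the real training code: the random arrays stand in for the outputs of `target_net` and `policy_net`, it assumes `memory.sample` returns arrays with a leading batch dimension (the sampling code is not shown), and `batch_size = 4` and `gamma = 0.99` are placeholder values. Only the 3-element state and the 5 actions come from this post.

```python
import numpy as np

# Placeholder sizes; only state_dim = 3 and num_actions = 5 come from the post.
batch_size, state_dim, num_actions, gamma = 4, 3, 5, 0.99

rng = np.random.default_rng(0)
rewards = rng.normal(size=batch_size)                        # shape (batch,)
actions = rng.integers(0, num_actions, size=batch_size)      # shape (batch,)
next_q_values = rng.normal(size=(batch_size, num_actions))   # stand-in for target_net output
q_values = rng.normal(size=(batch_size, num_actions))        # stand-in for policy_net output

# Target: r + gamma * max over next actions of Q_target(s', a')
target_q_values = rewards + gamma * next_q_values.max(axis=1)

# One-hot mask selects Q(s, a) for the action actually taken in each transition
masks = np.eye(num_actions)[actions]
q_action = (q_values * masks).sum(axis=1)

print(target_q_values.shape, q_action.shape)
```

Both arrays end up with one entry per sampled transition, which is what the loss function needs to compare them element by element.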
---Console output---
2020-09-30 21:47:06.446861: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-30 21:47:08.497370: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-09-30 21:47:08.527588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s
2020-09-30 21:47:08.528543: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-30 21:47:08.534707: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-30 21:47:08.540359: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-30 21:47:08.542732: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-30 21:47:08.548640: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-30 21:47:08.552110: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-30 21:47:08.565173: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-30 21:47:08.565677: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
Num GPUs Available: 1
2020-09-30 21:47:08.576341: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-30 21:47:08.588414: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18df008a400 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-30 21:47:08.589123: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-30 21:47:08.590160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s
2020-09-30 21:47:08.590965: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-30 21:47:08.591530: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-30 21:47:08.591898: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-30 21:47:08.592195: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-30 21:47:08.592657: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-30 21:47:08.593051: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-30 21:47:08.593582: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-30 21:47:08.594160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-30 21:47:09.346067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-30 21:47:09.346584: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-09-30 21:47:09.346967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-09-30 21:47:09.347409: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4826 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-09-30 21:47:09.352153: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18da6c97700 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-30 21:47:09.352903: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1060, Compute Capability 6.1
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 24)                96
_________________________________________________________________
dense_1 (Dense)              (None, 48)                1200
_________________________________________________________________
dense_2 (Dense)              (None, 24)                1176
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 125
=================================================================
Total params: 2,597
Trainable params: 2,597
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_4 (Dense)              (None, 24)                96
_________________________________________________________________
dense_5 (Dense)              (None, 48)                1200
_________________________________________________________________
dense_6 (Dense)              (None, 24)                1176
_________________________________________________________________
dense_7 (Dense)              (None, 5)                 125
=================================================================
Total params: 2,597
Trainable params: 2,597
Non-trainable params: 0
_________________________________________________________________
2020-09-30 21:47:09.444165: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
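The parameter counts in the two model summaries can be reproduced by hand, which shows how small these networks are. This is a minimal sketch, assuming the input is the 3-element state described at the top and that every Dense layer has a bias term; both assumptions are consistent with the 96 parameters reported for the first layer.

```python
# Dense layer parameters = inputs * units + units (weights plus biases).
layer_sizes = [3, 24, 48, 24, 5]  # 3-element state, then the four Dense layers

params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)  # [96, 1200, 1176, 125]
print(total_params)      # 2597
```

With only 2,597 parameters per network, each forward pass is very little work, which may be part of the reason the GPU utilization looks so low.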