Question

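My Python script is using 20% of the CPU but only 0.3% of the GPU. I am very new to Keras and TensorFlow, so it is quite likely that I am doing something wrong that causes this excessive slowness. The program is supposed to run 15000 training episodes of 400 steps each. No images or anything fancy are involved; the state is just a simple list of 3 elements. At the current speed, it would take about 100 hours to play all 15000 episodes. If anyone can help, I would really like to determine whether TensorFlow is actually using my GPU. I would also appreciate any comments on my custom training loop.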
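
As a rough sanity check on the figures above, here is a back-of-the-envelope sketch of the implied cost per step. It assumes every episode runs the full 400 steps and that the 100-hour estimate is accurate; both are approximations.

```python
# Back-of-the-envelope estimate of time per step, using the figures quoted above.
num_training_episodes = 15_000
num_steps = 400
estimated_hours = 100  # rough estimate, not a measurement

total_steps = num_training_episodes * num_steps
seconds_per_step = estimated_hours * 3600 / total_steps

print(f"total steps: {total_steps:,}")
print(f"time per step: {seconds_per_step * 1000:.0f} ms")
```

That works out to roughly 60 ms per step. For a network with 2,597 parameters, a cost of that size may point to fixed per-call overhead rather than to the arithmetic itself, but that is a guess until the loop is profiled.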

---Custom training loop---

for episode in range(1,num_training_episodes+1):
    done = False
    episode_reward = 0
    state = env.reset_environment(sat)
    for step in range(num_steps):

        action = agent.select_action(state, policy_net)
        new_state, episode_reward, done = env.step(sat, action)
        memory.push(Experience(state.copy(), action, new_state.copy(), episode_reward))
        state = new_state

        if memory.can_provide_sample(batch_size):

            states, actions, new_states, rewards = memory.sample(batch_size)

            next_q_values = target_net.predict(new_states)
            target_q_values = rewards +gamma*tf.reduce_max(next_q_values, axis = 1)

            masks = tf.one_hot(actions, num_actions)

            with tf.GradientTape() as tape:
                q_values = policy_net(np.array([states]))[0]

                q_action = tf.reduce_sum(tf.multiply(q_values, masks), axis = 1)

                loss = loss_func(target_q_values, q_action)

            grads = tape.gradient(loss, policy_net.trainable_variables)
            optimizer.apply_gradients(zip(grads, policy_net.trainable_variables))

        if (done == True) or (step == num_steps-1):
            reward_list.append(episode_reward)
            break

    if (episode % target_update) == 0:
        for layer_policy,layer_target in zip(policy_net.layers,target_net.layers):
            layer_target.set_weights(layer_policy.get_weights())

    if (episode % print_every) == 0:
        print("Episode:", episode-print_every, "to:", episode, "Number of correct stops:", env.correct_stop_counter, "Mean rewards:", np.mean(reward_list[episode-print_every:episode]))
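
To make the update step in the loop above easier to follow, here is a minimal NumPy sketch of the same target and masking arithmetic on a tiny batch. The batch values and `gamma = 0.99` are made up for illustration. Only `num_actions = 5` is taken from the model summary.

```python
import numpy as np

gamma = 0.99     # hypothetical discount factor
num_actions = 5  # matches the 5-unit output layer in the model summary

# Hypothetical batch of 2 transitions.
rewards = np.array([1.0, -1.0])
actions = np.array([3, 0])
next_q_values = np.array([[0.1, 0.5, 0.2, 0.0, 0.3],
                          [1.0, 0.2, 0.4, 0.9, 0.1]])
q_values = np.array([[0.0, 0.1, 0.2, 0.7, 0.4],
                     [0.6, 0.5, 0.3, 0.2, 0.1]])

# Target: r + gamma * max over next actions, one value per transition.
target_q_values = rewards + gamma * next_q_values.max(axis=1)

# One-hot masks keep only the Q-value of the action actually taken.
masks = np.eye(num_actions)[actions]
q_action = (q_values * masks).sum(axis=1)

print(target_q_values.shape, q_action.shape)
print(q_action)
```

Both arrays have one entry per transition, which is what `loss_func` expects to compare. Like the loop above, this sketch does not zero the bootstrap term for terminal transitions, since `done` is not stored in the replay memory.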

---Console output---

2020-09-30 21:47:06.446861: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-30 21:47:08.497370: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-09-30 21:47:08.527588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s
2020-09-30 21:47:08.528543: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-30 21:47:08.534707: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-30 21:47:08.540359: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-30 21:47:08.542732: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-30 21:47:08.548640: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-30 21:47:08.552110: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-30 21:47:08.565173: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-30 21:47:08.565677: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
Num GPUs Available:  1
2020-09-30 21:47:08.576341: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-30 21:47:08.588414: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18df008a400 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-30 21:47:08.589123: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-30 21:47:08.590160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s
2020-09-30 21:47:08.590965: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-30 21:47:08.591530: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-30 21:47:08.591898: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-30 21:47:08.592195: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-30 21:47:08.592657: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-30 21:47:08.593051: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-30 21:47:08.593582: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-30 21:47:08.594160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-30 21:47:09.346067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-30 21:47:09.346584: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2020-09-30 21:47:09.346967: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N 
2020-09-30 21:47:09.347409: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4826 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-09-30 21:47:09.352153: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x18da6c97700 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-30 21:47:09.352903: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 1060, Compute Capability 6.1
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 24)                96        
_________________________________________________________________
dense_1 (Dense)              (None, 48)                1200      
_________________________________________________________________
dense_2 (Dense)              (None, 24)                1176      
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 125       
=================================================================
Total params: 2,597
Trainable params: 2,597
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_4 (Dense)              (None, 24)                96        
_________________________________________________________________
dense_5 (Dense)              (None, 48)                1200      
_________________________________________________________________
dense_6 (Dense)              (None, 24)                1176      
_________________________________________________________________
dense_7 (Dense)              (None, 5)                 125       
=================================================================
Total params: 2,597
Trainable params: 2,597
Non-trainable params: 0
_________________________________________________________________
2020-09-30 21:47:09.444165: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
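
The `Created TensorFlow device (... device:GPU:0 ...)` line in the log above suggests that TensorFlow has registered the GPU. One way to double-check where operations actually run is sketched below. It assumes TensorFlow 2.1 or later, and the tensor shapes are arbitrary.

```python
import tensorflow as tf

# List the GPUs TensorFlow can see (an empty list means CPU only).
gpus = tf.config.list_physical_devices("GPU")
print("Num GPUs Available:", len(gpus))

# Run a small op and inspect which device holds the result tensor.
x = tf.random.uniform((4, 3))
y = tf.matmul(x, tf.transpose(x))
print("Result placed on:", y.device)
```

If the printed device string ends in `GPU:0`, the op ran on the GPU. For a per-op trace, calling `tf.debugging.set_log_device_placement(True)` at the very start of the program logs the device chosen for each operation. Note that low GPU utilization does not by itself mean the GPU is unused: with a model this small, each call may finish far faster than the Python code around it.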

Unsure whether TensorFlow is running on the GPU or the CPU in a custom Keras training loop, or whether the slowdown comes from the training loop code

0 answers: