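I am trying to do machine learning with Keras on data of shape (1000000, 12, 1), and I have tried both the CPU (AMD R5-2600 @ 3.88 GHz) and the GPU (RX580 @ 1411 MHz).
However, training runs at exactly the same speed on the CPU and the GPU.
I believe I have installed ROCm 2.6 and tensorflow-rocm 1.13.3 correctly, and the program runs without errors.
Am I doing something wrong, or is there something I can do to make training on the GPU faster?
I am using the TensorFlow backend and running the script from a terminal, with Anaconda and Python 3.5 installed.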
Using TensorFlow backend.
2019-07-13 03:11:15.890156: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-07-13 03:11:15.920477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1531] Found device 0 with properties:
name: Ellesmere [Radeon RX 470/480/570/570X/580/580X]
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1.411
pciBusID 0000:0a:00.0
Total memory: 8.00GiB
Free memory: 7.75GiB
2019-07-13 03:11:15.920532: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1642] Adding visible gpu devices: 0
2019-07-13 03:11:15.920570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-13 03:11:15.920591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1059] 0
2019-07-13 03:11:15.920610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1072] 0: N
2019-07-13 03:11:15.920696: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7539 MB memory) -> physical GPU (device: 0, name: Ellesmere [Radeon RX 470/480/570/570X/580/580X], pci bus id: 0000:0a:00.0)
['/job:localhost/replica:0/task:0/device:GPU:0']
WARNING:tensorflow:From /home/kenchou/anaconda3/envs/mlgpu/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/kenchou/anaconda3/envs/mlgpu/lib/python3.5/site-packages/tensorflow/python/keras/utils/losses_utils.py:170: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
(1000000, 12) (1000000,)
WARNING:tensorflow:From /home/kenchou/anaconda3/envs/mlgpu/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Train on 700000 samples, validate on 300000 samples
Everything looks fine, but training is very slow.
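
One possible cause worth checking, assuming the model is trained with the Keras default batch_size of 32 (the fit call is not shown here, so this is only an assumption): with just 12 values per sample, each training step does very little arithmetic, so per-step overhead may dominate and the GPU has little chance to pull ahead. The sketch below only counts how many steps one epoch takes; steps_per_epoch is a hypothetical helper, and 4096 is an arbitrary larger batch size picked for illustration.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches (gradient updates) processed in one epoch."""
    return math.ceil(num_samples / batch_size)


# Taken from the "Train on 700000 samples" line in the log above.
train_samples = 700000

# 32 is the Keras default batch_size for model.fit (assumed to be in use);
# 4096 is only an illustrative larger value, not a recommendation.
for batch_size in (32, 4096):
    print(batch_size, steps_per_epoch(train_samples, batch_size))
```

If that assumption holds, passing a larger batch_size to model.fit is one thing to try, though whether it helps depends on the model, which is not shown here.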