I have an EC2 instance (p2.xlarge) with this AMI:
Deep Learning AMI (Ubuntu) Version 5.0 - ami-7336d50e
It comes preinstalled with the latest deep learning framework binaries in separate virtual environments: MXNet, TensorFlow, Caffe, Caffe2, PyTorch, Keras, Chainer, Theano, and CNTK, with NVIDIA CUDA, cuDNN, and NCCL fully configured.
When I start my program (I am trying to build an RNN with Keras), I get:
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.7.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.7.5 locally
After Keras starts, I get this:
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.10GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 12639 get requests, put_count=6277 evicted_count=1000 eviction_rate=0.159312 and unsatisfied allocation rate=0.590395
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 100 to 110
But training is not fast: my MacBook Pro is faster than my EC2 instance, and after every epoch I get this warning:
tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 4156 get requests, put_count=8233 evicted_count=4000 eviction_rate=0.48585 and unsatisfied allocation rate=0.000481232
I installed keras_gpu and tensorflow_gpu, and I am using the virtual environment for Keras 2 with TensorFlow.
Can you tell me if I am doing something wrong? How can a simple little MacBook be faster than an EC2 instance with these specs:
p2.xlarge (11.75 ECU, 4 vCPUs, 2.7 GHz Intel Xeon E5-2686 v4, 61 GiB memory, EBS only)
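One way to double-check that TensorFlow is actually using the GPU (a minimal sketch, using the TF 1.x-era `device_lib` API that matches the log output above; `gpu_available` is a helper name I made up for illustration):

```python
# Sketch: verify that TensorFlow can see the Tesla K80 at all.
# If no GPU device is listed, training silently falls back to the CPU,
# which would explain the EC2 instance being slower than a MacBook.
from tensorflow.python.client import device_lib

def gpu_available():
    """Return True if TensorFlow lists at least one GPU device."""
    return any(d.device_type == 'GPU' for d in device_lib.list_local_devices())

print(device_lib.list_local_devices())  # should include a /gpu:0 entry on p2.xlarge
print(gpu_available())
```

If this prints `False`, the `tensorflow_gpu` build or CUDA libraries are not being picked up inside the active virtual environment; if it prints `True`, the GPU is visible and the slowdown lies elsewhere (for example in the model or input pipeline).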
Answer 0 (score: 0)
The answer is simple. On the EC2 instance (p2.xlarge), the GPU that TensorFlow uses is a Tesla K80; this GPU is roughly 4x to 10x faster than a CPU, whereas my MacBook has 8 CPU cores.