TensorFlow Serving cannot find the GPU

Time: 2017-09-21 05:45:06

Tags: tensorflow keras tensorflow-serving tensorflow-gpu

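I am trying to serve a trained Keras model with TensorFlow Serving. The export step works fine; I use

with tf.device('/gpu:0'):

before loading the model. But when I try to serve it, the GPU device cannot be found.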

TF_CPP_MIN_VLOG_LEVEL=1 CUDA_VISIBLE_DEVICES=2 /home/diana/serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9002 --model_name=ex_61 --model_base_path=/home/diana/code/Tf_Serving/ex_61_servable
2017-09-21 13:37:42.659616: I tensorflow_serving/model_servers/main.cc:147] Building single TensorFlow model file config:  model_name: ex_61 model_base_path: /home/diana/code/Tf_Serving/ex_61_servable
2017-09-21 13:37:42.659869: I tensorflow_serving/model_servers/server_core.cc:441] Adding/updating models.
2017-09-21 13:37:42.659905: I tensorflow_serving/model_servers/server_core.cc:492]  (Re-)adding model: ex_61
2017-09-21 13:37:42.660097: I tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:390] File-system polling update: Servable:{name: ex_61 version: 1}; Servable path: /home/diana/code/Tf_Serving/ex_61_servable/1; Polling frequency: 1
2017-09-21 13:37:42.661475: I tensorflow_serving/core/aspired_versions_manager.cc:235] Enqueueing aspired versions request: {name: ex_61 version: 1}
2017-09-21 13:37:42.760051: I tensorflow_serving/core/aspired_versions_manager.cc:245] Processing aspired versions request: {name: ex_61 version: 1}
2017-09-21 13:37:42.760097: I tensorflow_serving/core/aspired_versions_manager.cc:287] Adding {name: ex_61 version: 1} to BasicManager
2017-09-21 13:37:42.760116: I tensorflow_serving/core/basic_manager.cc:315] Request to start managing servable {name: ex_61 version: 1}
2017-09-21 13:37:42.760158: I tensorflow_serving/core/availability_preserving_policy.cc:77] AvailabilityPreservingPolicy requesting to load servable {name: ex_61 version: 1}
2017-09-21 13:37:42.760177: I tensorflow_serving/core/aspired_versions_manager.cc:341] Taking action: { action: 0 id: {name: ex_61 version: 1} }
2017-09-21 13:37:42.760190: I tensorflow_serving/core/basic_manager.cc:479] Request to load servable {name: ex_61 version: 1}
2017-09-21 13:37:42.760208: I tensorflow_serving/core/loader_harness.cc:57] Load requested for servable version {name: ex_61 version: 1}
2017-09-21 13:37:42.760450: I tensorflow_serving/core/basic_manager.cc:705] Successfully reserved resources to load servable {name: ex_61 version: 1}
2017-09-21 13:37:42.760487: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: ex_61 version: 1}
2017-09-21 13:37:42.760505: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: ex_61 version: 1}
2017-09-21 13:37:42.760535: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:360] Attempting to load native SavedModelBundle in bundle-shim from: /home/diana/code/Tf_Serving/ex_61_servable/1
2017-09-21 13:37:42.760559: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:236] Loading SavedModel from: /home/diana/code/Tf_Serving/ex_61_servable/1
2017-09-21 13:37:42.791626: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2017-09-21 13:37:42.791667: I external/org_tensorflow/tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 12
2017-09-21 13:37:42.792050: I external/org_tensorflow/tensorflow/core/common_runtime/direct_session.cc:86] Direct session inter op parallelism threads: 12
2017-09-21 13:37:42.823835: I external/org_tensorflow/tensorflow/core/common_runtime/optimization_registry.cc:37] Running optimization phase 0
2017-09-21 13:37:42.831347: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:284] Loading SavedModel: fail. Took 70656 microseconds.
2017-09-21 13:37:42.831414: E tensorflow_serving/util/retrier.cc:38] Loading servable: {name: ex_61 version: 1} failed: Invalid argument: Cannot assign a device for operation 'save/StringJoin': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
     [[Node: save/StringJoin = StringJoin[N=2, _output_shapes=[[]], separator="", _device="/device:GPU:0"](save/Const, save/StringJoin/inputs_1)]]
Note the last line: TensorFlow Serving cannot find the GPU device. How can I solve this? Thank you.
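The placement error in the log can be read mechanically. As a minimal sketch (the function name `parse_device_error` is hypothetical, and the regular expressions assume the exact message wording shown in the log above), the following Python extracts the device the op was pinned to and the devices the server actually registered:

```python
import re

# Error text copied from the model server log above.
ERROR = (
    "Invalid argument: Cannot assign a device for operation 'save/StringJoin': "
    "Operation was explicitly assigned to /device:GPU:0 but available devices are "
    "[ /job:localhost/replica:0/task:0/cpu:0 ]. "
    "Make sure the device specification refers to a valid device."
)


def parse_device_error(message):
    """Hypothetical helper: split a 'Cannot assign a device' message into parts."""
    op = re.search(r"operation '([^']+)'", message)
    requested = re.search(r"explicitly assigned to (\S+)", message)
    available = re.search(r"available devices are \[(.*?)\]", message)
    # Devices are separated by whitespace and/or commas inside the brackets.
    devices = available.group(1).replace(",", " ").split() if available else []
    return {
        "operation": op.group(1) if op else None,
        "requested": requested.group(1) if requested else None,
        "available": devices,
        "has_gpu": any("gpu" in d.lower() for d in devices),
    }


info = parse_device_error(ERROR)
print(info)
```

For this log the server registered only a CPU device, so an op pinned to /device:GPU:0 in the exported graph cannot be placed. That suggests either the model server process does not see a GPU, or the device pins recorded at export time are stricter than the serving environment allows.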

My TensorFlow environment:

(tensor27) diana@brick:~/code/Tf_Serving$ python
Python 2.7.13 |Continuum Analytics, Inc.| (default, Dec 20 2016, 23:09:15) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import tensorflow
>>> tensorflow.__path__
['/home/diana/anaconda3/envs/tensor27/lib/python2.7/site-packages/tensorflow']
>>> tensorflow.Session().run
2017-09-21 13:41:10.177715: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-21 13:41:10.177766: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-21 13:41:10.177786: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-21 13:41:10.177800: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-21 13:41:10.177811: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-09-21 13:41:10.528596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.835
pciBusID 0000:05:00.0
Total memory: 7.92GiB
Free memory: 7.81GiB
2017-09-21 13:41:10.528628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2017-09-21 13:41:10.528634: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2017-09-21 13:41:10.528650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:05:00.0)
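The serving command above sets CUDA_VISIBLE_DEVICES=2, while this Python session reports a single device 0. Whether that variable was also set for the Python session is not shown, so the following is only an illustration of how the variable filters and renumbers devices, not a diagnosis. It is a simplified model (the function name `visible_gpus` is hypothetical; only integer indices are handled, not GPU UUIDs) of the documented CUDA behaviour that enumeration stops at the first invalid index:

```python
def visible_gpus(cuda_visible_devices, physical_count):
    """Hypothetical model of CUDA_VISIBLE_DEVICES for integer indices only.

    Returns the physical GPU indices a process would see, in logical order:
    position 0 in the result is what the process calls GPU 0.
    """
    if cuda_visible_devices is None:
        # Variable unset: every physical device is visible.
        return list(range(physical_count))
    visible = []
    for token in cuda_visible_devices.split(","):
        token = token.strip()
        # Stop at the first entry that is not a valid physical index.
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


print(visible_gpus("2", 4))  # physical GPU 2 becomes logical GPU 0
print(visible_gpus("2", 1))  # index 2 does not exist on a one-GPU machine
```

Under this model, a value of 2 on a machine with fewer than three GPUs leaves the process with no visible GPU at all, which would be consistent with a server that lists only a CPU device. A server binary built without CUDA support would produce the same symptom, so both are worth checking.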

1 answer:

Answer 0: (score: 0)

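If your problem is still unresolved, or for anyone else who runs into the same issue, try setting up Docker as described here.

gRPC client code for connecting to the TensorFlow Serving server is available here.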

  

Disclaimer: the discussions linked above were posted by me. They have been tested and are shared here.