Question

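I need to use TensorFlow's Object Detection API to get bounding boxes for a fairly large image set (about 103,000 images), each 256 x 256 pixels. To do this in a reasonable amount of time, I asked my university for an 8-GPU server.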

The code I use to get the output on my computer (CPU) is as follows:

import tensorflow as tf  # TF 1.x API (tf.Session, tf.get_default_graph)

# Main function: run the detection graph on a batch of images
def run_inference(images, graph):
  with graph.as_default():
    # Get output tensors
    tensor_dict = {}
    for key in ['num_detections', 'detection_boxes', 'detection_scores', 'detection_classes']:
      tensor_name = key + ':0'
      tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)

    # Get input tensor
    image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

    # Note: a new session is created on every call
    with tf.Session() as sess:
      # Run inference
      output_dict = sess.run(tensor_dict, feed_dict={image_tensor: images})

  # Return the detections to the caller
  return output_dict

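How can I edit the code so that different images are processed in parallel on different GPUs? Also, would this noticeably reduce the running time, or is the bottleneck somewhere else?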
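One common way to approach this kind of question is data parallelism with one worker process per GPU: split the image list into one shard per GPU, and have each worker see only its own device through the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below shows only the sharding and batching bookkeeping in plain Python. The helper names `shard_indices`, `batches`, and `worker_env` are hypothetical and not part of TensorFlow or the Object Detection API, and the batch size of 64 is an arbitrary assumption.

```python
import os

def shard_indices(num_items, num_shards):
    """Split range(num_items) into num_shards contiguous, near-equal ranges."""
    base, extra = divmod(num_items, num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        # The first `extra` shards take one additional item each.
        size = base + (1 if i < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards

def batches(indices, batch_size):
    """Yield successive lists of at most batch_size indices."""
    indices = list(indices)
    for i in range(0, len(indices), batch_size):
        yield indices[i:i + batch_size]

def worker_env(gpu_id):
    """Copy of the current environment in which only GPU `gpu_id` is visible.

    Assumption: each worker is launched as a separate process with this
    environment, so the variable is set before TensorFlow initializes CUDA.
    """
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env

if __name__ == "__main__":
    num_images, num_gpus = 103000, 8  # figures taken from the question
    for gpu_id, shard in enumerate(shard_indices(num_images, num_gpus)):
        n_batches = sum(1 for _ in batches(shard, 64))  # 64 is an assumed batch size
        print(gpu_id, len(shard), n_batches)
```

Each worker would then load the frozen graph once, open a single session, and call `sess.run` per batch on its own shard. Whether this shortens the total running time noticeably depends on where the time is actually spent: creating a new `tf.Session` on every call, as in the code above, and reading and decoding images from disk can both dominate over the GPU computation, so it may be worth timing those steps first.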

Testing with multiple GPUs

0 answers: