How do I encode the input for a TensorFlow model that takes encoded_image_string_tensor as input?

Asked: 2020-05-22 14:33:08

Tags: tensorflow google-cloud-platform object-detection

I trained an object detection model on Google AI Platform and downloaded it. It is a standard saved_model.pb file that I want to load in Python and feed a single image to for inference. The problem is that the model's input is defined as encoded_image_string_tensor, which expects a base64-encoded string. How do I encode an image file in this format in Python?

print(model.inputs)
print(model.output_dtypes)
print(model.output_shapes)

[<tf.Tensor 'encoded_image_string_tensor:0' shape=(None,) dtype=string>, <tf.Tensor 'key:0' shape=(None,) dtype=string>, <tf.Tensor 'global_step:0' shape=() dtype=resource>]
{'detection_scores': tf.float32, 'detection_classes': tf.float32, 'num_detections': tf.float32, 'key': tf.string, 'detection_boxes': tf.float32}
{'detection_scores': TensorShape([None, 100]), 'detection_classes': TensorShape([None, 100]), 'num_detections': TensorShape([None]), 'key': TensorShape([None]), 'detection_boxes': TensorShape([None, 100, 4])}
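
These attributes come from the model's serving signature; a minimal loading sketch (assuming TF2; the model directory path here is a placeholder):

import tensorflow as tf

# Placeholder path to the downloaded model directory (the one containing saved_model.pb)
model_dir = "path/to/saved_model"

# Load the SavedModel and take its default serving signature (a ConcreteFunction)
model = tf.saved_model.load(model_dir)
model = model.signatures['serving_default']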

The existing example in tensorflow/models/research shows how to do this for an input of type image_tensor:

  image = np.asarray(image)
  # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
  input_tensor = tf.convert_to_tensor(image)
  # The model expects a batch of images, so add an axis with `tf.newaxis`.
  input_tensor = input_tensor[tf.newaxis,...]

When I run this code against the model with the encoded_image_string_tensor input, it produces the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in 
      1 for i in range(1):
----> 2     show_inference(model, TEST_IMAGE_PATHS[i])

 in show_inference(model, image_path)
     39 #   print(image_np)
     40   # Actual detection.
---> 41   output_dict = run_inference_for_single_image(model, image_np)
     42   # Visualization of the results of a detection.
     43   print(output_dict['detection_scores'][:3])

 in run_inference_for_single_image(model, image)
      7 
      8   # Run inference
----> 9   output_dict = model(input_tensor)
     10 
     11   # All outputs are batches tensors.

~\anaconda3\envs\tf2\lib\site-packages\tensorflow\python\eager\function.py in __call__(self, *args, **kwargs)
   1603       TypeError: For invalid positional/keyword argument combinations.
   1604     """
-> 1605     return self._call_impl(args, kwargs)
   1606 
   1607   def _call_impl(self, args, kwargs, cancellation_manager=None):

~\anaconda3\envs\tf2\lib\site-packages\tensorflow\python\eager\function.py in _call_impl(self, args, kwargs, cancellation_manager)
   1622            "of {}), got {}. When calling a concrete function, positional "
   1623            "arguments may not be bound to Tensors within nested structures."
-> 1624           ).format(self._num_positional_args, self._arg_keywords, args))
   1625     args = list(args)
   1626     for keyword in self._arg_keywords[len(args):]:

TypeError: Expected at most 0 positional arguments (and the rest keywords, of ['encoded_image', 'key']), got (,). When calling a concrete function, positional arguments may not be bound to Tensors within nested structures.

1 Answer:

Answer 0 (score: 1):

The run_inference_for_single_image() function from the object_detection_tutorial.ipynb notebook (which you appear to have used in your example) can easily be modified to encode the image with tf.io.encode_jpeg().

The built-in object detection models on Google AI Platform also require a key (any string tensor with a batch-size dimension) as input; I have added it to the model() call in the example below as well.

import numpy as np
import tensorflow as tf

def run_inference_for_single_image(model, image):
    image = np.asarray(image)

    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image)

    # Encode the (numerical, uint8) image tensor into JPEG bytes, i.e. the
    # "encoded_image" string the model expects
    encoded_image = tf.io.encode_jpeg(input_tensor)

    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    encoded_image = encoded_image[tf.newaxis,...]

    # Run inference (the SavedModel downloaded from AI platform also requires a "key" as input.)
    output_dict = model(encoded_image = encoded_image, key = tf.expand_dims(tf.convert_to_tensor("test_key"), 0))

    # ...
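
As a side note, encoded_image_string_tensor expects the raw encoded JPEG/PNG bytes as a string tensor; base64 wrapping is typically only needed when sending JSON requests to AI Platform. If the test image is already a JPEG file on disk, a sketch along these lines (the file path is a placeholder) avoids the decode/re-encode round trip:

import tensorflow as tf

# Placeholder path to an existing JPEG file
image_path = "test_images/image1.jpg"

# Read the file's raw bytes as a scalar string tensor, which is exactly what
# encoded_image_string_tensor expects for a local SavedModel call.
encoded_image = tf.io.read_file(image_path)

# Add a batch dimension and pass the required "key" input as well.
output_dict = model(
    encoded_image=encoded_image[tf.newaxis],
    key=tf.constant(["test_key"]))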