Question

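I am testing an exported model with gcloud local prediction. The model is a TensorFlow object detection model that was trained on a custom dataset. I am using the following gcloud command: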

gcloud ml-engine local predict --model-dir=/path/to/saved_model/ --json-instances=input.json --signature-name="serving_default" --verbosity debug

Without the verbosity flag, the command outputs nothing. With verbosity set to debug, I get the following traceback:

DEBUG: [Errno 32] Broken pipe
Traceback (most recent call last):
  File "/google-cloud-sdk/lib/googlecloudsdk/calliope/cli.py", line 984, in Execute
    resources = calliope_command.Run(cli=self, args=args)
  File "/google-cloud-sdk/lib/googlecloudsdk/calliope/backend.py", line 784, in Run
    resources = command_instance.Run(args)
  File "/google-cloud-sdk/lib/surface/ai_platform/local/predict.py", line 83, in Run
    signature_name=args.signature_name)
  File "/google-cloud-sdk/lib/googlecloudsdk/command_lib/ml_engine/local_utils.py", line 103, in RunPredict
    proc.stdin.write((json.dumps(instance) + '\n').encode('utf-8'))
IOError: [Errno 32] Broken pipe

Details of my exported model:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: encoded_image_string_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['detection_boxes'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 300, 4)
        name: detection_boxes:0
    outputs['detection_classes'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 300)
        name: detection_classes:0
    outputs['detection_features'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1, -1, -1, -1)
        name: detection_features:0
    outputs['detection_multiclass_scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 300, 2)
        name: detection_multiclass_scores:0
    outputs['detection_scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 300)
        name: detection_scores:0
    outputs['num_detections'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1)
        name: num_detections:0
    outputs['raw_detection_boxes'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 300, 4)
        name: raw_detection_boxes:0
    outputs['raw_detection_scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 300, 2)
        name: raw_detection_scores:0
  Method name is: tensorflow/serving/predict

I use the following code to generate the input.json used for prediction:

with open('input.json', 'wb') as f:
    img = Image.open("image.jpg")
    img = img.resize((width, height), Image.ANTIALIAS)
    output_str = io.BytesIO()
    img.save(output_str, "JPEG")
    image_byte_array = output_str.getvalue()
    image_base64 = base64.b64encode(image_byte_array)
    json_entry = {"b64": image_base64.decode()}
    #instances.append(json_entry
    request = json.dumps({'inputs': json_entry})
    f.write(request.encode('utf-8'))
f.close()

{"inputs": {"b64": "/9j/4AAQSkZJRgABAQAAAQABAAD/......}}

I am testing the prediction with a single image.

Answer 1

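I ran into the same problem and found that ml_engine/local_utils.py uses python to run ml_engine/local_predict.pyc, which is built for python2.7. My python is python3, so when ml_engine/local_utils.py tries to run ml_engine/local_predict.pyc using python (which is actually python3), it fails with the error: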

RuntimeError: Bad magic number in .pyc file

Solution 1:

You can simply set python2 as the default python for your system.
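Before changing the system default, it may help to confirm which major version the `python` on your PATH really is, since the unpatched local_utils.py looks up the executable named "python". The following is only a minimal sketch; the helper name `python_major_version` is hypothetical and not part of the Cloud SDK.

```python
import shutil
import subprocess
import sys


def python_major_version(executable):
    """Return the major version (2 or 3) reported by the given interpreter."""
    # print(...) with a single argument behaves the same on Python 2 and 3.
    out = subprocess.check_output(
        [executable, "-c", "import sys; print(sys.version_info[0])"]
    )
    return int(out.decode("utf-8").strip())


if __name__ == "__main__":
    # Resolve the bare name "python" the way a PATH lookup would.
    found = shutil.which("python")
    if found is None:
        print("no executable named 'python' on PATH", file=sys.stderr)
    else:
        print(found, "-> Python", python_major_version(found))
```

If this reports 3 while local_predict.pyc was built for python2.7, that would be consistent with the "Bad magic number" error above.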

Solution 2:

I changed ml_engine/local_utils.py with the following patch:

83c83
<   python_executables = files.SearchForExecutableOnPath("python")
---
>   python_executables = files.SearchForExecutableOnPath("python2")
114a115
>   log.debug(args)
124,126c125,130
<   for instance in instances:
<     proc.stdin.write((json.dumps(instance) + "\n").encode("utf-8"))
<   proc.stdin.flush()
---
>   try:
>     for instance in instances:
>       proc.stdin.write((json.dumps(instance) + "\n").encode("utf-8"))
>     proc.stdin.flush()
>   except:
>     pass

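The try-catch is needed so that the script can go on to read and print the error that occurred while running ml_engine/local_predict.pyc. Without it, the Broken pipe exception is raised first and hides that error.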
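To illustrate why swallowing the broken pipe helps, here is a standalone Python 3 sketch of the same pattern. It is not the SDK's code: `run_with_instances` is a hypothetical helper, and the failing child is a stand-in for an interpreter that exits before reading its stdin.

```python
import json
import subprocess
import sys


def run_with_instances(cmd, instances):
    """Feed JSON instances to a child process; return (returncode, stdout, stderr)."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        for instance in instances:
            proc.stdin.write((json.dumps(instance) + "\n").encode("utf-8"))
        proc.stdin.flush()
    except BrokenPipeError:
        # The child already exited; fall through so its stderr can be read.
        pass
    out, err = proc.communicate()
    return proc.returncode, out.decode("utf-8"), err.decode("utf-8")


if __name__ == "__main__":
    # Stand-in child that fails at startup and reports why on stderr.
    failing_child = [
        sys.executable,
        "-c",
        "import sys; sys.exit('simulated startup failure')",
    ]
    code, _, err = run_with_instances(failing_child, [{"inputs": {"b64": "AAAA"}}])
    print(code, err.strip())
```

Catching only BrokenPipeError, rather than using a bare except, keeps unrelated failures visible; the patch above uses the broader form.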

Answer 2

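According to this page, binary inputs must be suffixed with _bytes:

In your TensorFlow model code, you must name the aliases for your binary input and output tensors so that they end with '_bytes'.

Try suffixing your input with _bytes, or rebuild your model with a compatible serving input function.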
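As a quick way to spot this in a request, the sketch below flags instance keys that carry a {"b64": ...} value but do not end with the suffix. The function name `binary_keys_missing_suffix` is made up for illustration, and the check only reflects the naming rule quoted above, not any validation that gcloud itself performs.

```python
def binary_keys_missing_suffix(instance, suffix="_bytes"):
    """Return keys whose value is a {"b64": ...} object but whose name lacks the suffix."""
    missing = []
    for key, value in instance.items():
        is_b64_wrapper = isinstance(value, dict) and set(value) == {"b64"}
        if is_b64_wrapper and not key.endswith(suffix):
            missing.append(key)
    return missing


if __name__ == "__main__":
    # The question's request shape: the alias "inputs" is flagged.
    print(binary_keys_missing_suffix({"inputs": {"b64": "AAAA"}}))
    # An alias ending in "_bytes" is not flagged.
    print(binary_keys_missing_suffix({"image_bytes": {"b64": "AAAA"}}))
```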

Answer 3

When the command runs, the local SDK file /usr/lib/google-cloud-sdk/lib/googlecloudsdk/command_lib/ml_engine/local_utils.py appears to fail while reading the file contents:

  for instance in instances:
    proc.stdin.write((json.dumps(instance) + '\n').encode('utf-8'))
  proc.stdin.flush()

In your case I would expect the JSON to be correctly formatted, because otherwise we would usually get:

ERROR: (gcloud.ai-platform.local.predict) Input instances are not in JSON format. See "gcloud ml-engine predict --help" for details.
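To rule out a formatting problem before invoking gcloud, the instances file can be checked locally. The sketch below assumes the file holds one JSON instance per line; `parse_json_instances` is a hypothetical helper, and skipping blank lines is a choice made here rather than documented gcloud behavior.

```python
import json


def parse_json_instances(lines):
    """Parse newline-delimited JSON, raising ValueError that names the bad line."""
    instances = []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # Assumption: blank lines are skipped rather than rejected.
        try:
            instances.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(
                "line %d is not valid JSON: %s" % (line_number, exc)
            ) from exc
    return instances
```

A file object can be passed directly, for example: with open('input.json') as f: instances = parse_json_instances(f)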

This is a snippet of the code I normally use to generate a resized, b64-encoded image:

import base64
import json
from PIL import Image

INPUT_FILE = 'image.jpg'
OUTPUT_FILE = 'image_b64.json'


def convert_to_base64_resize(image_file):
  """Open image, resize, base64 encode it and create a JSON request"""
  # Note: the resized image overwrites the original file on disk.
  img = Image.open(image_file).resize((240, 240))
  img.save(image_file)
  with open(image_file, 'rb') as f:
    jpeg_bytes = base64.b64encode(f.read()).decode('utf-8')
  # Let json.dumps handle quoting instead of formatting the string by hand.
  predict_request = json.dumps({"image_bytes": {"b64": jpeg_bytes}})
  # Write JSON to file
  with open(OUTPUT_FILE, 'w') as out:
    out.write(predict_request)
  return predict_request

convert_to_base64_resize(INPUT_FILE)

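It would be helpful to see a copy of your JSON file or image and compare the contents.

For general troubleshooting I also use TensorFlow Serving, mainly to verify that my model works locally. (TensorFlow Serving supports pointing to a GCS location.) Keep in mind that local prediction with JSON instances expects this format: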

{"image_bytes": {"b64": body }}

I assume that after the changes suggested above, your model's signature looks like this:

...
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['image_bytes'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_tensor:0
...

Answer 4

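Unlike @Roman Kovtuh, I was able to run with python3. However, his technique of adding an exception handler let me determine that tensorflow was not installed in the environment visible to the process. Once it was installed, the process worked.

My changes to googlecloudsdk/command_lib/ml_engine/local_utils.py: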
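A direct way to test for this condition is to ask a specific interpreter whether it can find the module. This is only a sketch for Python 3 interpreters; `module_available` is a hypothetical helper, and which interpreter gcloud actually launches on your machine is something you would need to confirm separately.

```python
import subprocess
import sys

# One-liner executed by the target interpreter: exit 0 if the module is found.
_CHECK = (
    "import importlib.util, sys; "
    "sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)"
)


def module_available(python_executable, module_name):
    """Return True if module_name can be found by the given interpreter."""
    result = subprocess.run([python_executable, "-c", _CHECK, module_name])
    return result.returncode == 0


if __name__ == "__main__":
    print("tensorflow importable:", module_available(sys.executable, "tensorflow"))
```

Replacing sys.executable with the path of the interpreter used for local prediction shows whether tensorflow is visible to that process.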

106,109c106
<     try:
<       proc.stdin.write((json.dumps(instance) + '\n').encode('utf-8'))
<     except Exception as e:
<       print(f'Error displaying errors with instance {str(instance)[:100]}.  Exception {e}')
---
>     proc.stdin.write((json.dumps(instance) + '\n').encode('utf-8'))

I upvoted @Roman Kovtuh because this really helped.
