Batch prediction on TFRecords with Cloud ML

Date: 2018-09-12 20:59:01

Tags: python tensorflow machine-learning google-cloud-ml tfrecord

I followed this great tutorial and successfully trained a model on CloudML. My code can also make predictions offline, but now I am trying to get predictions from Cloud ML and running into some issues.

To deploy my model I followed this tutorial. I now have code that writes TFRecords with apache_beam.io.WriteToTFRecord, and I would like to run predictions on those TFRecords. Following this article, my command looks like this:

gcloud ml-engine jobs submit prediction $JOB_ID --model $MODEL --input-paths gs://"$FILE_INPUT".gz --output-path gs://"$OUTPUT"/predictions --region us-west1 --data-format TF_RECORD_GZIP
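For context, the Beam step that produces these files looks roughly like the sketch below (the feature name, values, and bucket path are placeholders, not my real pipeline); the .gz suffix is what makes the shards gzip-compressed so they match --data-format TF_RECORD_GZIP:

import apache_beam as beam
import tensorflow as tf

def to_serialized_example(values):
    # Pack one record into a serialized tf.Example proto.
    return tf.train.Example(features=tf.train.Features(feature={
        'x': tf.train.Feature(float_list=tf.train.FloatList(value=values)),
    })).SerializeToString()

with beam.Pipeline() as p:
    _ = (
        p
        | beam.Create([[1.0, 2.0], [3.0, 4.0]])
        | beam.Map(to_serialized_example)
        # compression_type defaults to AUTO, so the '.gz' suffix
        # makes WriteToTFRecord gzip-compress each shard.
        | beam.io.WriteToTFRecord('gs://my-bucket/examples',
                                  file_name_suffix='.gz'))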

But all I get is this error: 'Exception during running the graph: Expected serialized to be a scalar, got shape: [64]'

It seems to expect the data in a different format. I found the format spec for JSON here, but could not find how to do this with TFRecords.
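To double-check the file contents, I can read a shard back locally and confirm each record is a serialized tf.Example (the file name below is just an example shard name):

import tensorflow as tf

opts = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.GZIP)
for serialized in tf.python_io.tf_record_iterator('examples-00000-of-00001.gz', opts):
    print(tf.train.Example.FromString(serialized))  # prints the parsed proto
    break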

Update: here is the output of saved_model_cli show --all --dir:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['prediction']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['example_proto'] tensor_info:
        dtype: DT_STRING
        shape: unknown_rank
        name: input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['probability'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, 1)
        name: probability:0
  Method name is: tensorflow/serving/predict

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['example_proto'] tensor_info:
        dtype: DT_STRING
        shape: unknown_rank
        name: input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['probability'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, 1)
        name: probability:0
  Method name is: tensorflow/serving/predict

1 Answer:

Answer 0: (score: 2)

When you export the model, you need to make sure it is "batchable", i.e., the outer dimension of the input placeholder is shape=[None], e.g.

input = tf.placeholder(dtype=tf.string, shape=[None])
...
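As a rough sketch (the feature name and shape below are made up; adapt them to your schema), the batched serialized strings are then parsed with tf.parse_example, which handles a whole batch, rather than tf.parse_single_example:

serialized_examples = tf.placeholder(dtype=tf.string, shape=[None], name='input')
features = tf.parse_example(
    serialized_examples,
    features={'x': tf.FixedLenFeature([2], tf.float32)})
# features['x'] has shape [None, 2]; the outer None flows through the rest of the graph.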

That may require slightly modifying your graph. For example, I see that the shape of your output is hard-coded as [1, 1]. The outermost dimension should be None; that may happen automatically once you fix the placeholder, or it may require other changes.

Given that the output is named probabilities, I would also expect the innermost dimension to be > 1, i.e., the number of classes being predicted, so something like [None, NUM_CLASSES].
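Putting it together, an export along these lines (the names, feature schema, and NUM_CLASSES are illustrative, not taken from your model) yields a signature whose input has shape [None] and whose output has shape [None, NUM_CLASSES]:

import tensorflow as tf

NUM_CLASSES = 2  # illustrative

serialized = tf.placeholder(tf.string, shape=[None], name='input')
parsed = tf.parse_example(serialized, {'x': tf.FixedLenFeature([2], tf.float32)})
logits = tf.layers.dense(parsed['x'], NUM_CLASSES)
probability = tf.nn.softmax(logits, name='probability')  # shape [None, NUM_CLASSES]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.saved_model.simple_save(
        sess, 'export_dir',
        inputs={'example_proto': serialized},
        outputs={'probability': probability})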