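Prediction failed: Error processing input: Expected string, got dict

Time: 2018-06-25 15:45:49

Tags: python tensorflow machine-learning google-cloud-ml

I have gone through the TensorFlow getting-started tutorial (https://www.tensorflow.org/get_started/get_started_for_beginners) and made some small changes to the code to adapt it to my application. The feature columns for my case are the following: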

transaction_column = tf.feature_column.categorical_column_with_vocabulary_list(key='Transaction', vocabulary_list=["buy", "rent"])
localization_column = tf.feature_column.categorical_column_with_vocabulary_list(key='Localization', vocabulary_list=["barcelona", "girona"])
dimensions_feature_column = tf.feature_column.numeric_column("Dimensions")
buy_price_feature_column = tf.feature_column.numeric_column("BuyPrice")
rent_price_feature_column = tf.feature_column.numeric_column("RentPrice")

my_feature_columns = [
    tf.feature_column.indicator_column(transaction_column),
    tf.feature_column.indicator_column(localization_column),
    tf.feature_column.bucketized_column(source_column = dimensions_feature_column,
                                        boundaries = [50, 75, 100]),
    tf.feature_column.numeric_column(key='Rooms'),
    tf.feature_column.numeric_column(key='Toilets'),
    tf.feature_column.bucketized_column(source_column = buy_price_feature_column,
                                        boundaries = [1, 180000, 200000, 225000, 250000, 275000, 300000]),
    tf.feature_column.bucketized_column(source_column = rent_price_feature_column,
                                        boundaries = [1, 700, 1000, 1300])
]

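After that, I saved the model so that it can be used in Cloud ML Engine to make predictions. To export the model, I added the following code (after evaluating the model):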

feature_spec = tf.feature_column.make_parse_example_spec(my_feature_columns)
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
servable_model_dir = "modeloutput"
servable_model_path = classifier.export_savedmodel(servable_model_dir, export_input_fn)

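After running the code I get the proper model files in the "modeloutput" directory, and I create the model in the cloud (as explained in https://cloud.google.com/ml-engine/docs/tensorflow/getting-started-training-prediction#deploy_a_model_to_support_prediction, "Deploy a model to support prediction").

Once the model version was created, I tried to launch an online prediction with this model by running the following command in Cloud Shell: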

gcloud ml-engine predict --model $MODEL_NAME --version v1 --json-instances ../prediction.json

where $MODEL_NAME is the name of my model and prediction.json is a JSON file with the following content:

{"inputs":[
  {
     "Transaction":"rent",
     "Localization":"girona",
     "Dimensions":90,
     "Rooms":4,
     "Toilets":2,
     "BuyPrice":0,
     "RentPrice":1100
  }
  ]
}

However, the prediction fails and I get the following error message:

  

"error": "Prediction failed: Error processing input: Expected string, got {u'BuyPrice': 0, u'Transaction': u'rent', u'Rooms': 4, u'Localization': u'girona', u'Toilets': 2, u'RentPrice': 1100, u'Dimensions': 90} of type 'dict' instead."

The error is clear: a string is expected instead of a dictionary. If I check my SavedModel SignatureDef, I get the following information:

The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
  dtype: DT_STRING
  shape: (-1)
  name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['classes'] tensor_info:
  dtype: DT_STRING
  shape: (-1, 12)
  name: dnn/head/Tile:0
outputs['scores'] tensor_info:
  dtype: DT_FLOAT
  shape: (-1, 12)
  name: dnn/head/predictions/probabilities:0
Method name is: tensorflow/serving/classify

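It is clear that the expected dtype for the input is a string (DT_STRING), but I don't know how to format my input data so that the prediction succeeds. I have tried writing the input JSON in many different ways, but I keep getting errors. If I look at how prediction is performed in the tutorial (https://www.tensorflow.org/get_started/get_started_for_beginners), it seems clear that the prediction input is passed as a dictionary (predict_x in the tutorial code).

So, where am I going wrong? How can I make a prediction with this input data?

Thanks for your time.

Edit based on the answers ------

Following @Lak's second suggestion, I updated the code that exports the model, so it now looks like this: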

export_input_fn = serving_input_fn
servable_model_dir = "savedmodeloutput"
servable_model_path = classifier.export_savedmodel(servable_model_dir,
                                                   export_input_fn)
...

def serving_input_fn():
    # One placeholder per feature; the keys become the JSON field names.
    feature_placeholders = {
        'Transaction': tf.placeholder(tf.string, [None]),
        'Localization': tf.placeholder(tf.string, [None]),
        'Dimensions': tf.placeholder(tf.float32, [None]),
        'Rooms': tf.placeholder(tf.int32, [None]),
        'Toilets': tf.placeholder(tf.int32, [None]),
        'BuyPrice': tf.placeholder(tf.float32, [None]),
        'RentPrice': tf.placeholder(tf.float32, [None])
    }
    # Add a trailing dimension so each feature has shape [batch, 1].
    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)

After that, I created a new model and fed it the following JSON to get a prediction:

{
   "Transaction":"rent",
   "Localization":"girona",
   "Dimensions":90.0,
   "Rooms":4,
   "Toilets":2,
   "BuyPrice":0.0,
   "RentPrice":1100.0
}

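Note that I removed "inputs" from the JSON structure because I was getting the error "Unexpected tensor name: inputs" when making predictions. However, now I get a new and uglier error: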

  

"error": "Prediction failed: Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: dnn/input_from_feature_columns/input_layer/Transaction_indicator/to_sparse_input/indices = Where[T=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).\n\t [[Node: dnn/input_from_feature_columns/input_layer/Transaction_indicator/to_sparse_input/indices = Where[T=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]]]\")"

I checked the SignatureDef again and got the following information:

The given SavedModel SignatureDef contains the following input(s):
  inputs['Toilets'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: Placeholder_4:0
  inputs['Rooms'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: Placeholder_3:0
  inputs['Localization'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: Placeholder_1:0
  inputs['RentPrice'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: Placeholder_6:0
  inputs['BuyPrice'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: Placeholder_5:0
  inputs['Dimensions'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: Placeholder_2:0
  inputs['Transaction'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['class_ids'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 1)
      name: dnn/head/predictions/ExpandDims:0
  outputs['classes'] tensor_info:
      dtype: DT_STRING
      shape: (-1, 1)
      name: dnn/head/predictions/str_classes:0
  outputs['logits'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 12)
      name: dnn/logits/BiasAdd:0
  outputs['probabilities'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 12)
      name: dnn/head/predictions/probabilities:0
Method name is: tensorflow/serving/predict
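
The inputs section of this SignatureDef can be read as a per-key type contract for each JSON instance. The following is a minimal sketch of a pre-flight check against it. `check_instance`, `EXPECTED_INPUTS` and `ACCEPTED_TYPES` are hypothetical names introduced here (they are not part of TensorFlow or gcloud), and accepting JSON integers for DT_FLOAT inputs is an assumption.

```python
# Hypothetical helper: check one JSON instance against the input dtypes
# listed in the SignatureDef above. Not part of TensorFlow or gcloud.
EXPECTED_INPUTS = {
    "Transaction": "DT_STRING",
    "Localization": "DT_STRING",
    "Dimensions": "DT_FLOAT",
    "Rooms": "DT_INT32",
    "Toilets": "DT_INT32",
    "BuyPrice": "DT_FLOAT",
    "RentPrice": "DT_FLOAT",
}

# Python types accepted for each dtype (assumption: ints are fine for floats).
ACCEPTED_TYPES = {
    "DT_STRING": (str,),
    "DT_INT32": (int,),
    "DT_FLOAT": (int, float),
}


def check_instance(instance):
    """Return a list of problems; an empty list means the instance fits."""
    problems = []
    for key in sorted(set(EXPECTED_INPUTS) - set(instance)):
        problems.append("missing input: " + key)
    for key in sorted(set(instance) - set(EXPECTED_INPUTS)):
        problems.append("unexpected input: " + key)
    for key, dtype in EXPECTED_INPUTS.items():
        if key in instance:
            value = instance[key]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, ACCEPTED_TYPES[dtype]):
                problems.append(
                    "%s: expected %s, got %s" % (key, dtype, type(value).__name__)
                )
    return problems


instance = {
    "Transaction": "rent",
    "Localization": "girona",
    "Dimensions": 90.0,
    "Rooms": 4,
    "Toilets": 2,
    "BuyPrice": 0.0,
    "RentPrice": 1100.0,
}
print(check_instance(instance))
```

An instance wrapped in an extra "inputs" key would be reported as having an unexpected input, which mirrors the "Unexpected tensor name: inputs" error mentioned earlier.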

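Am I doing something wrong in one of the steps? Thanks!

New update

I have run a local prediction and it executed successfully, returning the expected prediction results. The command used: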

gcloud ml-engine local predict --model-dir $MODEL_DIR --json-instances=../prediction.json

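where MODEL_DIR is the directory containing the files generated during model training. So the problem seems to be in the exported model: somehow the model that is exported and later used for prediction is not correct. I have read that TensorFlow versions could be the source of the problem, but I don't understand how. Isn't my whole code executed with the same TF version? Any ideas on this?

Thanks!

2 answers:

Answer 0: (score: 2)

The problem is in your serving input function. You are using build_parsing_serving_input_receiver_fn, which is the function to use if you are sending tf.Example strings: https://www.tensorflow.org/api_docs/python/tf/estimator/export/build_parsing_serving_input_receiver_fn

Two ways to fix it:

  1. Send a tf.Example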

    example = tf.train.Example(features=tf.train.Features(feature={
        # bytes_list values must be bytes, hence the b'' literal
        'transaction': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'rent'])),
        'rentPrice': tf.train.Feature(float_list=tf.train.FloatList(value=[1000.0])),
    }))

    # Serialized protocol buffer, which is what the parsing input receiver expects
    string_to_send = example.SerializeToString()

  2. Change the serving input function so that you can send JSON:

    def serving_input_fn():
        # One placeholder per feature; the keys become the JSON field names.
        feature_placeholders = {
            'transaction': tf.placeholder(tf.string, [None]),
            ...
            'rentPrice': tf.placeholder(tf.float32, [None]),
        }
        # Add a trailing dimension so each feature has shape [batch, 1].
        features = {
            key: tf.expand_dims(tensor, -1)
            for key, tensor in feature_placeholders.items()
        }
        return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)


    export_input_fn = serving_input_fn
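
With a serving input function of this kind, each instance is a flat JSON object keyed by the placeholder names, and the file passed to `--json-instances` is read as newline-delimited JSON, one instance per line. Below is a minimal sketch of producing such a file with the standard library only. `to_json_lines` and `write_instances` are hypothetical helper names, and the keys are taken from the question on the assumption that the placeholders are named after its feature columns.

```python
import json


def to_json_lines(instances):
    """Serialize instances as newline-delimited JSON, one instance per line."""
    return "".join(json.dumps(instance, sort_keys=True) + "\n" for instance in instances)


def write_instances(path, instances):
    """Write the newline-delimited instances to a file for --json-instances."""
    with open(path, "w") as handle:
        handle.write(to_json_lines(instances))


# Assumed keys: they must match the keys of feature_placeholders exactly.
instances = [
    {
        "Transaction": "rent",
        "Localization": "girona",
        "Dimensions": 90.0,
        "Rooms": 4,
        "Toilets": 2,
        "BuyPrice": 0.0,
        "RentPrice": 1100.0,
    },
]

print(to_json_lines(instances), end="")
```

Calling `write_instances("prediction.json", instances)` would then produce a file suitable for the `gcloud ml-engine predict` command shown in the question. Note that there is no outer "inputs" wrapper: every line is one instance.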

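Answer 1: (score: 0)

Problem solved :)

After several experiments, I finally found that I had to create the model version with the latest runtime version (1.8):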

gcloud ml-engine versions create v2 --model $MODEL_NAME --origin $MODEL_BINARIES --runtime-version 1.8
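
The error text itself asks whether the binary interpreting the GraphDef is up to date with the one that generated it, which fits a model exported with a newer TensorFlow than the runtime serving it. As a rough sketch, the value for `--runtime-version` can be derived from the TensorFlow version used for training. `runtime_version_for` is a hypothetical helper, and it assumes the runtime version is named after TensorFlow's major.minor version, as with 1.8 here.

```python
def runtime_version_for(tf_version):
    """Reduce a TensorFlow version string such as '1.8.0' to 'major.minor'."""
    parts = tf_version.split(".")
    if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
        raise ValueError("unrecognized TensorFlow version: %r" % tf_version)
    return "%s.%s" % (parts[0], parts[1])


# In the training script this would be tf.__version__; a literal is used
# here (an assumption) so the sketch runs without TensorFlow installed.
training_tf_version = "1.8.0"
print("--runtime-version", runtime_version_for(training_tf_version))
```

Printing the flag from the training script next to the export step makes it harder to deploy a model against an older default runtime by accident.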